ABSTRACT



The STUDENT problem solving system, programmed in LISP, accepts
as input a comfortable but restricted subset of English which can
express a wide variety of algebra story problems. STUDENT finds the
solution to a large class of these problems. STUDENT can utilize a
store of global information not specific to any one problem, and may
make assumptions about the interpretation of ambiguities in the
wording of the problem being solved. If it uses such information, or
makes any assumptions, STUDENT communicates this fact to the user.

The thesis includes a summary of other English language
question-answering systems. All these systems, and STUDENT, are
evaluated according to four standard criteria.

The linguistic analysis in STUDENT is a first approximation to
the analytic portion of a semantic theory of discourse outlined in
the thesis. STUDENT finds the set of kernel sentences which are the
base of the input discourse, and transforms this sequence of kernel
sentences into a set of simultaneous equations which form the
semantic base of the STUDENT system. STUDENT then tries to solve
this set of equations for the values of requested unknowns. If it is
successful it gives the answers in English. If not, STUDENT asks the
user for more information, and indicates the nature of the desired
information. The STUDENT system is a first step toward natural
language communication with computers. Further work on the semantic
theory proposed should result in much more sophisticated systems.

CHAPTER I: INTRODUCTION



The aim of the research reported here was to discover how one
could build a computer program which could communicate with people
in a natural language within some restricted problem domain. In the
course of this investigation, I wrote a set of computer programs,
the STUDENT system, which accepts as input a comfortable but
restricted subset of English which can be used to express a wide
variety of algebra story problems. The problems shown in Figure 1
illustrate some of the communication and problem solving
capabilities of this system.

In the following discussion, I shall use phrases such as "the
computer understands English". In all such cases, the "English" is
just the restricted subset of English which is allowable as input
for the computer program under discussion. In addition, for purposes
of this report I have adopted the following operational definition
of understanding. A computer understands a subset of English if it
accepts input sentences which are members of this subset, and
answers questions based on information contained in the input. The
STUDENT system understands English in this sense.



A. The Problem Context of the STUDENT System.

In constructing a question-answering system, many problems are
greatly simplified if the problem context is restricted. The
simplifications resulting from the restrictions embodied in the
STUDENT system, and the reasons these simplifications arise, will be
discussed in detail in the body of this report.

Figure 1: Some Problems Solved by STUDENT

The STUDENT system is designed to answer questions embedded in
English language statements of algebra story problems such as those
shown in Figure 1. STUDENT does this by constructing from the
English input a corresponding set of algebraic equations, and
solving this set of equations for the requested unknowns. If needed,
STUDENT has access to a store of "global" information, not specific
to any particular problem, and can retrieve relevant facts and
equations from this store of information. STUDENT comments on its
progress in solving a problem, and can request the help of the
questioner if it gets stuck.
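
As a rough illustration of the translate-then-solve idea just
described, the following Python sketch handles a toy input language.
It is not STUDENT's LISP implementation: the two sentence patterns
("X is N." and "X is N times Y."), the function names to_equations
and solve, and the substitution-only solver are all assumptions made
for this example.

```python
import re
from fractions import Fraction


def to_equations(text):
    """Translate toy English sentences into (variable, coefficient, other) triples.

    'X is N.'          -> (X, N, None)   meaning X = N
    'X is N times Y.'  -> (X, N, Y)      meaning X = N * Y
    Sentences that fit neither hypothetical pattern are ignored.
    """
    equations = []
    for sentence in re.split(r"[.?]\s*", text.strip()):
        if not sentence:
            continue
        m = re.fullmatch(r"(.+?) is (\d+) times (.+)", sentence)
        if m:
            equations.append((m.group(1).lower(), Fraction(m.group(2)), m.group(3).lower()))
            continue
        m = re.fullmatch(r"(.+?) is (\d+)", sentence)
        if m:
            equations.append((m.group(1).lower(), Fraction(m.group(2)), None))
    return equations


def solve(equations):
    """Solve by repeated substitution until no new variable can be evaluated."""
    values = {}
    changed = True
    while changed:
        changed = False
        for var, coeff, other in equations:
            if var in values:
                continue
            if other is None:
                values[var] = coeff
                changed = True
            elif other in values:
                values[var] = coeff * values[other]
                changed = True
    return values


problem = ("The number of pencils is 3 times the number of pens. "
           "The number of pens is 4.")
print(solve(to_equations(problem)))
```

The phrases themselves serve as variable names here, so the solver
needs no knowledge of what "pens" or "pencils" mean.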

There are a number of reasons why I chose the context of algebra
story problems in which to develop techniques which would allow a
computer problem solving system to accept a natural language input.
First, we know a good type of data structure in which to store
information needed to answer questions in this context, namely,
algebraic equations. There exist well known algorithms for deducing
information implicit in the equations, that is, values for
particular variables which satisfy the set of equations.

In addition, I felt that there was a manageable subset of
English in which many types of algebra story problems were
expressible. A large number of these story problems are available in
first year high school text books, and I have transcribed some of
them into STUDENT's input English. This problem context provides an
interesting task performed by humans, and from the time taken by the
process from input to solution of the equations we can obtain a
measure of comparison between the performance of STUDENT and of a
human on the same problems. The STUDENT system, running on the IBM
7094, answers most questions that it can handle as fast or faster
than humans trying the same problem. In making this comparison, one
should remember that the human is competing against a computer which
can perform over one hundred thousand additions per second.






B. Reasons for Wanting Natural Language Input.

Why should one want to talk to a computer in English? There are
many tongues the computer already understands, such as FORTRAN,
COMIT, LISP, ALGOL, COBOL, to name just a few. These serve
adequately as communication media with the computer for a large
class of problems. A more pertinent question is really, when is
English input to a computer desirable?

English input is desirable, for example, if it is necessary to
use the computer for retrieval of information from a text in
English. If a computer could accept English input, much information
now recorded only in English would be available for computer use
without need for human translation.

A computer which understood English would be more accessible to
any speaker of English, whether or not he was trained in any
"foreign" computer tongue. For a single shot at the computer with a
question not likely to be repeated, it would not be worthwhile to
train the user in a specialized language. For fact retrieval, rather
than document retrieval, English is a good vehicle for stating
queries. For a good description of the differences between fact and
document retrieval, see Cooper.



Programming languages are process oriented. One cannot describe
a problem, only a method for finding a solution to the problem. A
natural language is a convenient vehicle for providing a description
of the problem itself, leaving the choice of processing to the
problem solver accepting the input. In the extreme case, one would
like to talk to the computer about a problem, with appropriate
questions and interjections by the computer on assumptions it finds
necessary, until the computer claims that the problem is now well
formed, and an attempt at solution can be made.

Finally, man's ability to use symbols and language is a prime
factor in his intelligence, and if we can learn how to make a
computer understand a natural language, we will have taken a big
step toward creating an "artificially intelligent" computer.



C. Criteria for Evaluating Question-Answering Systems.

We have defined understanding in terms of the ability to answer
questions in English. A number of question-answering systems have
been built, and will be described in the next section. In this
section, we shall give a number of criteria for evaluating
question-answering systems.

In many systems there is a separation of data input and question
input. For all systems under consideration, the input questions are
in English. The input data may be either in English or in a
prestructured format, e.g. a tree or hierarchy. The English data
input may be used as a data base as is, or mapped into a structured
information store. Simmons, in his survey of English
question-answering systems (40), calls those systems utilizing a
structured information store "data base question-answerers", as
opposed to "text-based question-answerers" which retrieve facts from
the original text.

The extent of understanding of a question-answering system can
be measured along three different dimensions: syntactic, semantic
and deductive. Along the syntactic dimension one can measure the
grammatical complexity allowable in input sentences. This may differ
for the data input and question input. In the simplest case, one or
some small number of fixed format sentences are allowable inputs.
Less restricted inputs may include any sentences which can be parsed
by a given grammar. The closer this grammar is to a grammar of all
of English, the less restricted is the input. Because text-based
question-answering systems accept as input any string of words,
without further processing, they have no syntactic limitations.
However, the fact retrieval program may only be able to extract
information from those portions of a text with less than some
maximum syntactic complexity.

In data base question-answering systems, only prescribed
relationships between words, or objects, may be representable in the
information store. Other information may be discarded or lost. This
is a limitation in the semantic dimension of understanding.

In order to obtain answers to questions not explicitly given in
the input, a question-answering system must have the power to
perform some deductions. The structure of the information store may
facilitate such deductive ability. The range of deductive ability is
measured along the deductive dimension of understanding. The
structure of the information store may also aid in selecting only
relevant material for use in the deductive question-answering
process, thus improving the efficiency of the system.

Another criterion, closely related to the extent of
understanding, is the facility with which the syntactic, semantic,
or deductive abilities of a question-answering system can be
extended. In the best case one could improve the system along any
dimension by talking to it in English. Alternatively, one might have
to add some new programs to the system, or at worst, any change
might imply complete reprogramming of the entire system.

An important additional consideration for the user of a
question-answering system is the amount of knowledge of the internal
structure of the system that is necessary to use it. At best, one
need not be aware of the information storage structure utilized at
all. At worst, a thorough knowledge of the internal structure may be
necessary to construct suitable input.



Another measure of the usefulness of a question-answering system
is its ability to interact with the user. In the worst case, a
question is asked and sometime later an answer or report of failure
is given. When the question cannot be answered, no indication is
given of the cause of failure, nor does the system allow the person
to give any help. This is typical of the operation of a number of
Air Force query systems (Jay Keyser, personal communication). In the
best case, the system will ask the user for help, and accept
suggestions of appropriate courses of action.

In this section we have given four criteria for evaluating
question-answering systems. They may be summarized as follows:

1) Extent of understanding (syntactic, semantic, and deductive
abilities)

2) Facility for extending abilities (syntactic, semantic, and
deductive)

3) Need by user for knowledge of internal structure of the system

4) Extent of interaction with the user



D. English Language Question-Answering Systems.

In this section, I shall give a critical summary of a number of
English language question-answering systems, utilizing the criteria
outlined in the previous section. This discussion will provide a
context for the section of the concluding chapter which summarizes
the capabilities of the STUDENT system. For a discussion of the
different syntactic analysis schemes mentioned below, see the survey
by Bobrow (4).

1) Phillips. One of the earliest question-answering systems was
written in 1960 at MIT by Anthony Phillips. It is a data base system
which accepts sentences which can be parsed by a very simple
context-free phrase structure grammar, of the type defined by
Chomsky. Additional syntactic restrictions require that each word
must be in only one grammatical class, and that a sentence has
exactly one parsing.

A parsed sentence is transformed into a list of five elements,
the subject, verb, object, time phrase, and place phrase in the
sentence. All other information in the sentence is disregarded.
Questions are answered by matching the list from the transformed
question against the list for each input sentence. When a match is
found, the corresponding sentence is given as an answer.

Phillips' system has no deductive ability, and extending its
abilities would require reprogramming the system. A questioner must
be aware that the system utilizes a matching pattern which does not
recognize synonyms; therefore the sentence "The teacher eats lunch
at noon." will not be recognized as an answer to the question "What
does the teacher do at twelve o'clock?" When Phillips' system cannot
find an answer, it reports only "(THE ORACLE DOES NOT KNOW)". It
provides for no further interaction with the user.
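
The five-element matching scheme, and its blindness to synonyms,
can be sketched in a few lines of Python. This is only an
illustration of the idea as summarized above, not Phillips' program:
the Parsed record, the use of None for slots the question leaves
open, and the hand-built parses are assumptions for the example.

```python
from typing import NamedTuple, Optional


class Parsed(NamedTuple):
    """The five elements kept from a parsed sentence; None marks an open slot."""
    subject: Optional[str]
    verb: Optional[str]
    obj: Optional[str]
    time: Optional[str]
    place: Optional[str]


def matches(question: Parsed, fact: Parsed) -> bool:
    # Every slot the question fills in must agree exactly with the stored fact.
    return all(q is None or q == f for q, f in zip(question, fact))


def answer(question: Parsed, store):
    """Return the first stored sentence whose five-element list matches."""
    for fact, sentence in store:
        if matches(question, fact):
            return sentence
    return None  # the caller would report that no answer is known


store = [
    (Parsed("teacher", "eats", "lunch", "noon", None),
     "The teacher eats lunch at noon."),
]

# Same wording for the time phrase: the stored sentence is found.
print(answer(Parsed("teacher", None, None, "noon", None), store))
# A synonym for the time phrase: exact matching fails.
print(answer(Parsed("teacher", None, None, "twelve o'clock", None), store))
```

Because matching is string equality on each slot, "noon" and
"twelve o'clock" are simply different symbols to such a matcher.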

2) Green. Baseball is a question-answering system designed and
programmed at Lincoln Laboratories by Green, Wolf, Chomsky and
Laughery (19). It is a data base system in which the data is placed
in memory in a prestructured tree format. The data consists of the
dates, location, opposing teams and scores of some American League
baseball games. Only questions to the system can be given in
English, not the data.

Questions must be simple sentences, with no relative clauses,
logical or coordinate connectives. With these restrictions, the
program will accept any question couched in words contained in a
vocabulary list quite adequate for asking questions about baseball
statistics. In addition, the parsing routine, based on techniques
developed by Harris (21), must find a parsing for the question.

The questions must pertain to statistics about baseball games
found in the information store. One cannot ask questions about
extrema, such as "highest" score or "fewest" number of games won.
The parsed question is transformed into a standard specification (or
spec) list, and the question-answering routine utilizes this
canonical form for the meaning of the question. For example, the
question "Who beat the Yankees on July 4?" would be transformed into
the "spec-list":

Team (losing) = New York

Date = July 4

Because Baseball does not utilize English for data input, we
cannot talk about deductions made from information implicit in
several sentences. However, Baseball can perform operations such as
counting (the number of games played by one team, for example) and
thus in the sense that it is utilizing several separate data units
in its store, it is performing deductions.

Baseball's abilities can only be extended by extensive
reprogramming, though the techniques utilized have some general
applicability. Because the parsing program has a very complete
grammar, and the vocabulary list is quite comprehensive for the
problem domain, the user needs no knowledge of the internal
structure of the Baseball program. No provision for interaction with
the user was made.
3) Simmons. The SYNTHEX system is a text-based
question-answering system designed and programmed at SDC by Simmons.
The entire contents of a children's encyclopedia has been
transferred to magnetic tape for use as the information store. An
index has been prepared listing the location of all the content
words in the text, including words like "worms," "eat," and "birds,"
while excluding function words like "and," "the," and "of." All the
content words of a question are extracted, and information rich
sections of the text are retrieved, i.e., sections that are locally
dense in content words contained in the question. For example, if
the question were "What do worms eat?", with content words "worms"
and "eat", the two sentences "Birds eat worms on the grass." and
"Most worms usually eat grass." might be retrieved.

At this time, the program performs a syntactic analysis of the
question and of the sentences that may contain the answer. A
comparison of the dependency trees of the question and various
sentences may eliminate some irrelevant sentences. In the example,
"Birds eat worms on the grass" is eliminated because "worms" is the
object of the verb "eats" instead of the subject as in the question.
In the general case, the remaining sentences are given in some
ranked order as possibly answering the question.
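
The first, purely lexical stage of this retrieval process can be
sketched as an inverted index over content words. The Python below
is an illustration only, not the SYNTHEX program: the tiny
function-word list, the three-sentence "text", and the rule of
returning the sentences sharing the most content words with the
question are assumptions for the example, and the later
dependency-tree comparison is not modeled.

```python
import re
from collections import defaultdict

# A deliberately tiny, hypothetical list of function words to exclude.
FUNCTION_WORDS = {"and", "the", "of", "on", "what", "do"}


def content_words(text):
    """Lower-case the text and keep only words not on the function-word list."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in FUNCTION_WORDS]


def build_index(sentences):
    """Map each content word to the set of sentence numbers containing it."""
    index = defaultdict(set)
    for i, sentence in enumerate(sentences):
        for word in content_words(sentence):
            index[word].add(i)
    return index


def retrieve(question, sentences, index):
    """Return the sentences sharing the most content words with the question."""
    hits = defaultdict(int)
    for word in set(content_words(question)):
        for i in index.get(word, ()):
            hits[i] += 1
    best = max(hits.values(), default=0)
    return [sentences[i] for i in sorted(hits) if hits[i] == best]


text = [
    "Birds eat worms on the grass.",
    "Most worms usually eat grass.",
    "Horses eat hay.",
]
index = build_index(text)
print(retrieve("What do worms eat?", text, index))
```

Both sentences about worms are returned at this stage; only a
comparison of syntactic roles, as described above, can reject the
one in which "worms" is the object.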

SYNTHEX is limited syntactically by its grammar to the extent
that the syntactic analysis eliminates irrelevant sentences. It
makes no use of the meaning of any statement or words, and cannot
deduce answers from information implicit in two or more sentences.
Because the grammar is independent of the program, the syntactic
ability of SYNTHEX can be extended relatively easily. However,
before it can become a good question-answering system, some semantic
abilities will have to be added.

SYNTHEX does not explicitly provide for interaction with the
user, but because it is implemented in the SDC time-sharing system
(9), a user may modify a previous question if the sentences
retrieved were not suitable. The mechanism for selection of
sentences must be kept in mind to get best results.

4) Lindsay. While at the Carnegie Institute of Technology,
Robert Lindsay (28) programmed the SAD SAM question-answering
system. The input to the system is a set of sentences in Basic
English, a subset of English devised by C.K. Ogden (35), which has a
vocabulary of about 1500 words and a simple subset of the full
English grammar. The SAD part (Syntactic Appraiser and Diagrammer)
of SAD SAM parses the sentence using a predictive analysis scheme.
The Semantic Analyzing Machine (SAM) extracts from these parsed
sentences information about the family relationships of people
mentioned; it stores this information on a computer representation
of the family tree, and ignores all other information in the
sentence. For example, from the parsing of "Tom, Mary's brother,
went to the store," Lindsay's program would extract the sibling
relationship of Tom and Mary, place them on the family tree as
descendants of the same mother and father, and ignore the
information about where Tom went.

The information storage structure utilized by SAD SAM, namely,
the family tree, facilitates deductions from information implicit in
many sentences. Because a family relationship is defined in terms of
the relative position (no pun intended) of two people in their
family tree, computation of the relationship is independent of the
number of sentences required to place in the tree the path between
the individuals.
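
A small Python sketch may make this point concrete: once people are
placed in a shared structure, a relationship never stated in any
single sentence can be read off from their positions. This is an
illustration under simplifying assumptions, not Lindsay's program.
The FamilyTree class and its methods are hypothetical, only sibling
facts are stored, and each person is simply assigned the identifier
of the parental unit from which he or she descends.

```python
class FamilyTree:
    """Toy family tree that records only which parental unit each person descends from."""

    def __init__(self):
        self._unit_of = {}    # person -> identifier of their parents' unit
        self._next_unit = 0

    def add_siblings(self, a, b):
        """Record that a and b descend from the same mother and father."""
        ua, ub = self._unit_of.get(a), self._unit_of.get(b)
        if ua is None and ub is None:
            ua = self._next_unit
            self._next_unit += 1
            self._unit_of[a] = self._unit_of[b] = ua
        elif ua is None:
            self._unit_of[a] = ub
        elif ub is None:
            self._unit_of[b] = ua
        elif ua != ub:
            # Two previously separate groups turn out to share parents: merge them.
            for person, unit in list(self._unit_of.items()):
                if unit == ub:
                    self._unit_of[person] = ua

    def are_siblings(self, a, b):
        """Relationship is computed from tree position, not from any one sentence."""
        ua = self._unit_of.get(a)
        return a != b and ua is not None and ua == self._unit_of.get(b)


tree = FamilyTree()
tree.add_siblings("Tom", "Mary")    # from "Tom, Mary's brother, went to the store."
tree.add_siblings("Mary", "Jane")   # from a second, hypothetical sentence
print(tree.are_siblings("Tom", "Jane"))
```

No input sentence mentions Tom and Jane together; the answer follows
from where the two names sit in the stored structure.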

Extending the abilities of the SAD SAM system would require
reprogramming. No provision is made for interaction with the user.
No internal knowledge of the program structure is necessary if the
user restricts his queries to questions of family relationships, and
his language to Basic English.

5) Raphael. The SIR question-answering system (mnemonic for
Semantic Information Retrieval) was designed by Bertram Raphael (38)
at MIT. The SIR system accepts simple sentences in any of about 20
fixed formats useful for expressing certain relationships between
objects. The semantic relationships extracted from these sentences
are those of set membership, set inclusion, subpart, left-to-right
position and ownership.

The information about the relationships between various objects
is stored in a semantic network, where the nodes of the network are
objects and the relationships are indicated by directed labeled
links between nodes. For example, if the three sentences "John is a
boy," "A boy is a person," and "Two hands are part of any person"
were an input to SIR, four nodes labeled John, boy, person and hand
would be created. Included in the network would be a link indicating
set membership between John and boy, another with a label indicating
set inclusion between boy and person, and a link indicating hand is
a subpart of person, with the number of parts equal to 2.

Separate question-answering routines are used for questions
involving different relationships. Each routine takes cognizance of
the interaction of various relationships, and can deduce answers
from the linked structure of the network, independent of the number
of sentences which were necessary to set up these links. For
example, by tracing the links from "John" to "hand," SIR would
answer "YES" to the question "Is a hand part of John?"

The SIR system can interact with the user. For example, if told
that "A finger is part of a hand" and asked "How many fingers does
John have?" it would reply "How many fingers per hand?" Then if it
is told "Every hand has five fingers," it will answer the question
with "The answer is 10".
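
The link-tracing deduction and the request for missing information
can be sketched with a very small network in Python. This is a toy
written for illustration, not Raphael's SIR: the SemanticNet class,
its method names, the merging of set membership and set inclusion
into a single "isa" link, and the naive plural formed by appending
"s" are all assumptions of the example.

```python
class SemanticNet:
    """Toy network with 'isa' links and counted 'part' links."""

    def __init__(self):
        self.isa = {}      # node -> more general node (membership or inclusion)
        self.parts = {}    # whole -> {part: count, or None if the count is unknown}

    def add_isa(self, node, general):
        self.isa[node] = general

    def add_part(self, whole, part, count=None):
        self.parts.setdefault(whole, {})[part] = count

    def how_many(self, owner, part):
        """Climb the isa chain from owner until some class has the part."""
        node = owner
        while node is not None:
            result = self._count(node, part)
            if result is not None:
                return result      # a number, or a question for the user
            node = self.isa.get(node)
        return "DON'T KNOW"

    def _count(self, whole, part):
        """Multiply counts along part links; ask when a count is missing."""
        for sub, n in self.parts.get(whole, {}).items():
            if sub == part:
                inner = 1
            else:
                inner = self._count(sub, part)
                if inner is None:
                    continue
                if isinstance(inner, str):
                    return inner   # pass a pending question upward
            if n is None:
                return f"How many {sub}s per {whole}?"
            return n * inner
        return None


net = SemanticNet()
net.add_isa("John", "boy")            # "John is a boy"
net.add_isa("boy", "person")          # "A boy is a person"
net.add_part("person", "hand", 2)     # "Two hands are part of any person"
net.add_part("hand", "finger")        # "A finger is part of a hand"
print(net.how_many("John", "finger"))
net.add_part("hand", "finger", 5)     # "Every hand has five fingers"
print(net.how_many("John", "finger"))
```

The answer depends only on the links present in the network, not on
how many input sentences were needed to create them.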

Any extensions of the SIR system necessitate additional
programming effort, though it is considerably easier to add new
syntactic forms than new semantic relationships. Within the input
limits of the 20 fixed format statements, the user need not know
anything of the internal structure of the information storage
structure.



E. Other Related Work. 

In addition to those question-answering systems described above,
a number of programs have been written to translate English
statements into a logical notation to check the consistency of a set
of statements, and the validity of logical arguments. In the sense
that, given a corpus transformed into some logical notation, and
another statement, a logic-based system can answer the question "Is
this statement (or its negation) implied by the corpus?", such
logic-based systems are question-answering systems.

Cooper (12) and Darlington (14) both have programs which
translate a subset of English into the propositional calculus.
Darlington is also working on programs which can translate English
into the first order and second order predicate calculus. A
difficult problem being considered by Darlington, in trying to
handle implications of English statements in terms of their logical
translation, is the determination of the proper level of analysis
for a particular problem - that is, whether to translate the input
into second order predicate calculus where proofs are very
difficult, or to try to use first order predicate or propositional
calculus to prove the theorem.

At tMe mi^t&aii 'tatmu. of ^M^Ha^nl*^ Ri3^^ ^2^^^ XS^tti ^ICfc) 
and Sillars (39) have designed a system in which pictures and English
language statements are converted to expressions in the first order
predicamd'^miml^'^Oag^^iM'f^Mm amt^-mUlMetuM'tmaeMashhait laagitage 

McCarthy's Advice-Taker, though not designed to accept English
input, would make an excellent base for a question-answering system.
Fischer Black (2) has programmed a system which can do all of
McCarthy's Advice-Taker problems, and can be adapted to accept a
very limited subset of English. The deductive system in Black's
program is equivalent to the propositional calculus.



A number of people have done work bearing directly on the
problem of solving algebra word problems stated in English. Sylvia
Garfinkle (18) wrote a paper in which she described the heuristics
she would use in programming a computer to solve algebra word
problems, but did not write a program. Most of the heuristics were
too vague to really be used; e.g. stating that one should identify
two variables' names which are only slightly different, but giving
no good criteria for a slight difference. Taken from Garfinkle's
paper were a number of simplified statements of algebra story
problems which she transcribed and transformed from problems in a
first year algebra textbook.

Michael Coleman (11), at MIT, wrote a term paper describing a
program which sets up the equations for some types of algebra story
problems also handled by STUDENT. Some of the special heuristics I
use for "age problems" were based on those he invented.

construct this type of program, but again did not implement these
ideas. He suggests methods for transformation of English input to
equations which would require much more information about English
than is used in the STUDENT program, and therefore seem more
complicated and difficult to implement. The STUDENT program
considers words as symbols, and makes do with as little knowledge
about the meaning of words as is compatible with the goal of finding
a solution of the particular problem.



The purpose of this chapter is to put the techniques of analysis
embodied in the STUDENT program into a wider context, and indicate
how they would fit into a more general language processing system.
We will describe in this chapter a theory of the generation and
analysis of discourse. STUDENT can then be considered a first
approximation to a computer implementation of the analytic portion
of the theory, with certain restrictions on the interpretation of a
discourse to be analyzed. It will be evident from the theory why
analysis is so greatly simplified by the imposed restrictions.



A. Language as Communication.

Language is an encoding used for communication between a speaker
and a listener (or writer and reader). To transmit an "idea", the
speaker must first encode it in a message, as a string in the
transmission language. In order to understand this message, a
listener must decode it, and extract its meaning. The coding of a
particular message, M, is a function of both its global context and
local context. The global context of a message is the background
knowledge of the speaker and the listener, including some knowledge
of possible universes of discourse, and codings for some simple
ideas.

The local context of a message, M, is the set of messages
temporally adjacent to M. M may refer back to earlier messages. M
may even be just a modification of a previous message, and only
understandable in this context. For example, consider the second
sentence of the following discourse: "How many chaplains are in the
U.S. Army? How many are in the navy?"

In order for communication to take place, the information map of
both the listener and the speaker must be approximately the same, at
least for the universe of discourse; also the decoding process of
the listener must be an approximate inverse of the encoding process
of the speaker. Education in language is, in large part, an attempt
to force the language processors of different people into a uniform
mold to facilitate successful communication. We are not proposing
that identity in detail is achieved, but as Quine so nicely put it
(37):

"Different persons growing up in the same language are like
different bushes trimmed and trained to take the shape of identical
elephants. The anatomical details of twigs and branches will fulfill
the elephantine form differently from bush to bush, but the overall
outward results are alike."

As a speaker transmits successive messages concerning some
portion of his information map, the listener who understands the
messages constructs a model of a "situation". The relationship
between the listener's model and the speaker's information map is
that from each can be extracted the transmitted information relevant
to the universe of discourse, including information deducible from
the entire set of messages. The internal structure of the listener's
model need bear no resemblance to that of the speaker, and may in
general contain far less detail.



B. Theories of Language.

According to Morris' theory of signs, the encoding and decoding
of language can be separated into three levels. The first level is
the syntactic, which deals with the relationships of signs to other
signs. A syntactic analysis, taking cognizance of classes of words,
can yield structurings of messages which indicate common processing
features. The second level, semantic analysis, is concerned with the
relationships of signs to the things they denote. A third level,
pragmatic analysis, is concerned with the relationships of signs to
their users.

Our theory will deal with all three levels of analysis, with
primary emphasis on the relation of the semantic level to the
generation of discourse.






Many theories of syntax have been developed to describe the
structure of English, and many of these have served as bases for
computer programs which perform syntactic analysis. For a complete
survey of such syntactic analysis programs, see Bobrow (4). Because
they ignore meaning, syntactic analysis programs based on such
theories often yield many possible structurings for a single
sentence which is unambiguous to a person. In terms of meaning, many
of these structurings are unacceptable. For a good discussion of why
ambiguities occur in syntactic analysis, see Kuno and Oettinger
(25).






prograiBS bteVe.:Jteen terifetsai iWhdidi i»Btt(W«6ft*:«pntta«ft;tofti:l3ftfeori;ec,fe &>- 
glish- sjMiCen«s«s>« Iii jwst ifeBsefer «l^ *eat<»ic«fc^^fi^ 

dominately meaningless nonsense. The coherent discourse generator of 
Klein (23) is the one exception I know. Klein utilizes an input text 
from which he extracts certain structural dependencies of the words 
in the input. He then generates sentences and Ji»|i«^jJ*j!Dr.«^ 
leased for J outpu**, m ipcka(tpp6s»aaf ^vstmiaa =w» Afe ti» ptitd« lim the 
genea»Cft! aittt eaete adt ia^itf staauAO^k Jdw pw n rt m t imm v»iifmk»tetm vi^ 
those tmiad in tltei^ia^iit !toBflifc.k; ffiiiMW«f^i-*ir«rtis^i^ii?J^^*» x^^agtoaw^^^n^ 
attenq^ :l8 iaate #0 >«taftr^li»jilMta«:ly« aiM»tog <»£ jHQfvpoed, .6aK»pt: in 
so £«»:;«« tills' ]i»imia^(ift«e>'ileet»d iiisiJt»»^aafi9i£u^f«iic^ftficjtl^ other 
.words. ift''tbe>ijiput,jfeiaaBfe,c; .'--.•,--' . ■.::*;" . o^:-k'.;5',:* 



^ Some theoi^esjAi^do ^c^^ , 

being dBvelope4 po.^ P^i^^t^ ([^ atrt^ .|^»1l,l^*|ffy^f??f J*?!^, 
developed at the Linguistic Research Center of the University of Tex- 
as are an explication of Morris' theory of signs. Though not yet 
iioplenent^ rte 8«»s^c, ai^d^ jp^^^ji^pr^- 

liminary phrase fltructtt|re ^^a<pti.c_,^^fl^f^^, ^ i^ppd^^ |^ sypt|;p|^^ 
structures, .^ B^r^i^fB^ocab^^^^t^^ id^l^^; s|,,^ ^ 
semantic ccmsfants,, ^ei^iall^ ^^ff^f^.J^h^^MB^'^M^^-, «^l^ 
have the saae iii^ani^ ttl^^i^s a fype 4>^ C|iia^^ , 

structures ia texv^ <^f^ tt^pe^l|k^.^li|^ J^|»t^|^J^ ay^ ejsp 
plicit iKHiel of the w^l^ld. lip provJ^onJ.f^jpM^J|^|he^|hf9 ^o^ 
deduction of inforiMtioii ijiylic^^^ 






Lamb (26) also has proposed a stratificational gram- 
mar, not yet implemented on a computer, in which successive levels of 
analysis are pe|rfo«ed, i^f ^ a ^^h^^mfflhF^MivP'mdfi^B^'''^' 
tures in a "sememic'' gtjiituii pf thf Jf^uf^ llSbl^bf^Sf^? »t|^fr . 
turn are WBdlM, of "|«a«," ^ ^^^i^, f^,M§m^f^s9ii^^Ji^: : 
lation«ihi)p8 between 4ifferfntt. b^l«-|,,^|^fi^^»ff>ffnfffnlf*i|c^ 
™ean the ?a>^^thin|8 should ipap Ij^^ Mg^^sjr^^t^ ^_ thiff.^e^ft^^ 
stratum. Sememic structures are thus canonical representations of 
meaning. 



The theory of language generation and analysis we 
describe below is designed to handle what we call coherent discourse. 
A discourse is a sequence of sentences such that the meaning of the 
discourse cannot be determined by interpreting each sentence inde- 
pendently, disregarding the other sentences in the discourse. The 
interpretation of each sentence may be dependent on the local con- 
text, in the sense defined previously. A discourse is coherent if 
it has a complete and consistent interpretation. Completeness im- 
plies that there is no substring of the discourse which does not 
have some interpretation in the model of the situation being built 
by the listener. 

A listener's ability to build a model of a situation from a 
discourse is dependent on information available to him from his gen- 
eral store of knowledge. Therefore it is quite possible for a dis- 
course to seem coherent to one listener and not another. A writer, 
reading his own writing, may feel that he has generated a coherent 
sequence of sentences, but in fact, it is incoherent to all other 
readers. This is, unfortunately, not a rare occurrence in the sci- 
entific literature. Conversely, a listener who is a psychiatrist, 
for example, may find coherence in a sequence of remarks which a 
patient thinks are entirely unrelated. 

The STUDENT system utilizes an expandable store of general 
knowledge to build a model of a situation described in a member of 
a limited class of discourses. The form of this model of a situation 
built by STUDENT will be discussed in detail in a later section of 
this chapter. As far as I know, STUDENT is the only computer im- 
plementation of a theory of discourse analysis now extant that maps 
a discourse into some representation of its meaning. When the theo- 
ries of Lamb and Pendegraft are implemented, they should also be 
able to analyze this class of discourse (and others). Harris also 
talks about "discourse analysis," (2&f 4to€ ^it liiff use ot thfe^^ 
he specifically excludes the use of meaning, stating: 

"The method [of discourse analysis] is formal, depending 
only on the occurrence of morphemes as distinguishable ele- 
ments, and not upon the analyst's knowledge of the particular 
meaning of each morpheme." 
D. The Use of Kernel Sentences in G»fei^feti^y^ 

A basic postulate of our theory of language analysis is that 
a listener understands a discourse by transforming it into an equi- 
valent (in meaning) sequence of simpler kernel sentences. A kernel 
sentence is one which the listener can understand directly; that 
is, one for which he knows a transformation into his information 
store. Conversely, a speaker generates a set of kernel sentences 
from his information map, and utilizes a sequence of transformations 
on this set to yield his spoken discourse. This set of kernel sen- 
tences is not invariant from person to person, and even varies for a 
single individual as he learns. 

The use of kernel sentences in this way is controversial. 
However, the theory is proposed as a good framework for understanding 
and implementing language processing on a computer, not necessarily 
as a model for human behaviour. The usefulness of this theory as a 
psychological model is an empirical question. Skinner has 
given some psychological justification for assuming the existence of 
a set of base sentences, and Chomsky (7) has discussed the linguis- 
tic merits of the use of the concept of kernel sentences. Despite 
this common concept of kernel sentences, in practice, our use of 
kernel sentences is different from that of Skinner or Chomsky. Our 
use of kernel sentences as a basis of a language is analogous to the 
use of generators in defining a group. 

Although we are not proposing our theory as a basis for a psy- 
chological model, it has been useful, to avoid circumlocutions, to 
describe the theory in terms of the properties and actions of a hypo- 
thetical speaker and listener. All statements about speakers and 
listeners should be interpreted as referring to computer programs 
which, respectively, generate and analyze coherent discourse. 
E. Generation of Coherent Discourse. 

speaker JS»|^.. SOTO ||0^^^^ ^ ^.:-. 

shall not_ be CRij|tf5i3W*^l^eiHJL|^^.,h<|l|;j^^ Hfi^ .^^^ti,,,^r .J,t8, ex-^,. , 

act form^^,,pj||^^ ^^^pi|R,^^i;,,|*l^,j^ , 

language taf^^ 1^ .|:|iey, ibu|U: allji|^?t^,lilf |^9PP«|^^» ^f?^^ helow. 

The basic components of the model are a set of objects {O_i}, a set 
of functions {F_i^n}, a set of relations {R_i^n}, a set of pro- 
positions {P_i}, and a set of semantic deductive rules. A function 
F_i^n is a mapping from ordered sets of n objects, called the argu- 
ments of F_i^n, into the set of objects. The mapping may be multi- 
valued and is defined only if the arguments satisfy a set of con- 
ditions associated with F_i^n. A condition is essentially membership 

in a cjLass ,o^, pb^ept^^ l^^t^ |.|^_^^lj|^)|[ijwre |||e^^|^^ J^^,.,,.=A ^B-„„ 

lation rJ . if pr-f|>ef;^^, |p»e ^Sjli f^^rfif '■ M l^f*^^i *?#:.a?*¥^^^^ ^ 
of a J.abel (9 unique, lde|iti#|jpr).,^ aiid:Jtst(^dei;f4 im*; ,<|^ A ctwdttiona , 

_ . _ - , ., ,^ ... . .- ..... ^ .- ....,,, .^ ., „ 

called jthe,afgtf^»t.9p|«4.ft|^nf 1^^^^ 5*1 f®" 

lations^^re ^ain.,|:i^|^n8j...,^ . ■ ■ -v,/,!^,; 

An elementary proposition consists of a label associated with 
some relation, R_i^n, and an ordered set of n objects satisfying the 
argument conditions for this relation. One can think of these pro- 
positions as the beliefs of a speaker about what relationships be- 
tween objects he has noticed are true in the world. Complex pro- 
positions are logical combinations of elementary propositions. The 
semantic deductive rules give procedures for adding new 
propositions to the model based on the propositions now in the model. 
In addition to the ordinary rules of logic, these rules include axioms 
about the relationships of the relations in the model. The semantic 



deductive rules also include links to the senses of the speaker. For 
example, one such deductive rule for adding a proposition to the model 
might be (loosely speaking) "look in the real world and see if it is 
true." These rules essentially determine how the model is to be ex- 
panded, and are the most complex part of a complete system. How- 
ever, from our present point of view, we need only consider these 
rules as a black box which can extend the set of propositions in the 
model. 



A closed question is a relational label for some R_i^n and an 
ordered set of n objects. The answer to this question is affirmative 
if the proposition, consisting of this label and the n objects, is 
in the model (or can be added to it). If the negation of this pro- 
position is in the model (or can be added), the answer is negative. 
Otherwise the answer is undefined. 

An open question consists of a relational label for an n-argu- 
ment relation, R_i^n, and a set of objects corresponding to n-k of these 
arguments, where 1 <= k <= n. An answer to an open question is an or- 
dered set of k objects, such that if these objects are associated 
with the k unspecified arguments of R_i^n, the resulting proposition is 
in the model or can be added to it. An open question may have no 
answers, or may have one or more answers. A condition is an open 
question with k=1, and an object satisfies a condition if it is an 
answer to the question. 
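
The definitions of closed and open questions may be made concrete by a small sketch. The Python fragment below is an illustration only and is not part of the STUDENT system: the object names, the relation label FATHER-OF, and the function names are hypothetical, and the semantic deductive rules (the "or can be added to it" clause) are deliberately left out, so that only propositions already stored in the toy model are consulted.

```python
from itertools import product

# Hypothetical toy model. Objects are strings; an elementary proposition is
# a (relation label, ordered argument tuple) pair the speaker holds true.
OBJECTS = {"John", "Bill", "Jack"}
PROPOSITIONS = {
    ("FATHER-OF", ("John", "Bill")),
    ("FATHER-OF", ("John", "Jack")),
}
# Propositions whose negation is in the model are kept in a separate set.
NEGATED = {("FATHER-OF", ("Bill", "John"))}


def closed_question(label, args):
    """Return True (affirmative), False (negative) or None (undefined)."""
    proposition = (label, tuple(args))
    if proposition in PROPOSITIONS:
        return True
    if proposition in NEGATED:
        return False
    return None


def open_question(label, pattern):
    """Answer an open question.

    `pattern` gives an object for each specified argument and None for each
    of the k unspecified ones. The result lists every ordered k-tuple of
    objects which, placed in the open positions, gives a stored proposition.
    """
    open_slots = [i for i, arg in enumerate(pattern) if arg is None]
    answers = []
    for candidate in product(sorted(OBJECTS), repeat=len(open_slots)):
        args = list(pattern)
        for slot, obj in zip(open_slots, candidate):
            args[slot] = obj
        if (label, tuple(args)) in PROPOSITIONS:
            answers.append(candidate)
    return answers
```

With k=1 the open question acts as a condition: open_question("FATHER-OF", ["John", None]) lists the objects satisfying "John is the father of ____" in this toy model.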

2) Generation of Kernel Sentences. We have described the 
logical properties of the speaker's model of the world. We shall 
now consider how strings in a language, words, phrases, and sentences, 
are associated with the model. Corresponding to the set of objects 
{O_i} there is a set, {N_j}, of strings (in English in our case), 
called the names of the objects. There is a many-one mapping from 
{N_j} onto {O_i}. It is many-one because one object may have more 
than one name; e.g., "frankfurter" and "hot dog" both map into the 
same object in the model. 

Recall that functions map n-tuples of objects into objects. 
Thus a function name and an n-tuple can specify an object, and we 
can derive a name for this object from the function name and the 
names of its n arguments. Associated with each function is at 
least one linguistic form, a string of words with blanks in which 
names of arguments of the function must be inserted. Examples of 
linguistic forms associated with a model are "number of ____", 
"father of ____", and "the child of ____ and ____". There is 
a many-one mapping from the set of linguistic forms {L_i} onto the 
set of functions. Two examples of multiple linguistic forms for 
the same function are: "father of ____" and "____'s father"; 
and "____ plus ____" and "the sum of ____ and ____". Thus, 
if objects x and y have names "the first number" and "the second 
number" and associated with the function " * " is the linguistic 
form "the product of ____ and ____", then the name of the object 
produced by applying the function " * " to x and y is "the product 
of the first number and the second number". A parsing of a name 
thus must decompose it into the part which is the linguistic form, 
and the parts which are names of arguments of the corresponding func- 
tion. We shall call objects defined in terms of a function and an 
n-tuple of objects a functionally defined object, and those which 
are not functionally defined we shall call simple objects. Simple 
objects have simple names and functionally defined objects have 
composite names. 
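
The derivation of a composite name from a function name and the names of its arguments can be sketched as follows. This is an illustrative fragment under stated assumptions, not the STUDENT implementation: the table FORMS, the function labels, and the use of "{}" to mark a blank are all hypothetical choices made for the example.

```python
# Hypothetical linguistic forms; "{}" marks a blank for an argument name.
FORMS = {
    "TIMES": "the product of {} and {}",
    "PLUS": "the sum of {} and {}",
    "FATHER": "father of {}",
}


def name_of(obj):
    """Build a name for an object.

    A simple object is represented by a plain string (its simple name).
    A functionally defined object is a tuple (function, arg1, ..., argn);
    its composite name is the function's linguistic form with the names of
    the arguments inserted, recursively, into the blanks.
    """
    if isinstance(obj, str):
        return obj
    function, *args = obj
    return FORMS[function].format(*(name_of(arg) for arg in args))
```

Parsing a name is the inverse task: the linguistic form must be recognized and the argument names recovered, which is harder because the same word may belong either to a form or to a name.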

In addition to linguistic forms associated with functions, 
there are linguistic forms associated with relations. For an n ar- 
gument relation there are n blanks in the linguistic form. Examples 
of relational linguistic forms are: "____ equals ____", 
"____ gave ____ to ____" and "____ speaks". It is this 
set of linguistic forms, corresponding to the relations in the model, 
that serve as frames for the kernel sentences. 

In a manner similar to the way composite names are built, a 
kernel sentence corresponding to an elementary proposition is con- 
structed by inserting names corresponding to each argument in the 
appropriate blank. Names may be simple or composite. An example of 
a kernel sentence for a proposition built from such a relational 
linguistic form is "John's father gave .3 times the salary of Bill 
to Jack." which contains the simple names "John", ".3", "Bill", 
and "Jack". It contains the functional linguistic forms "____'s 
father", "____ times ____" and "salary of ____" and the rela- 
tional linguistic form "____ gave ____ to ____". 
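
The construction of this example kernel sentence can be sketched in the same spirit. The fragment is a hypothetical illustration, not STUDENT's code: the labels GAVE, FATHER, TIMES and SALARY are invented for the example, and the form "the salary of {}" includes the article so that the output reproduces the sentence quoted in the text.

```python
# Hypothetical functional and relational linguistic forms; "{}" is a blank.
FUNCTION_FORMS = {
    "FATHER": "{}'s father",
    "TIMES": "{} times {}",
    "SALARY": "the salary of {}",
}
RELATION_FORMS = {"GAVE": "{} gave {} to {}"}


def name_of(obj):
    """Return a simple name unchanged, or build a composite name."""
    if isinstance(obj, str):
        return obj
    function, *args = obj
    return FUNCTION_FORMS[function].format(*(name_of(arg) for arg in args))


def kernel_sentence(label, args):
    """Insert the name of each argument into the relational form."""
    names = [name_of(arg) for arg in args]
    return RELATION_FORMS[label].format(*names) + "."


# The elementary proposition behind the example sentence in the text.
proposition = ("GAVE", [("FATHER", "John"),
                        ("TIMES", ".3", ("SALARY", "Bill")),
                        "Jack"])
sentence = kernel_sentence(*proposition)
```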



A kernel sentence corresponding to a complex proposition 
is constructed recursively from the kernel sentences corresponding 
to its elementary propositional constituents by placing them in the 
corresponding places in the linguistic forms "____ and ____", 
"____ or ____", "not ____" etc. 



The kernel sentence corresponding to a closed question is 
constructed from the kernel of the corresponding proposition by 
placing it in the linguistic form "is it true that ____?" For 
an open question, dummy objects are placed in the open argument po- 
sitions to complete a propositional form. These dummy arguments 
have names "who", "what", "where", etc.; and which dummy objects are 
used depends on the condition on that argument position. A question 
mark is placed at the end of the kernel sentence constructed in 
the usual way from the relational linguistic form and the names of 
the arguments. 
In generating a coherent discourse, a speaker chooses a num- 
ber of propositions in his model and/or some open or closed ques- 
tions. He then uses linguistic information associated with the model 
to construct the set of kernel sentences corresponding to this set of 
chosen propositions. In the next section we will discuss how he 
generates his discourse from this set of kernels. 

3) Transformations on Kernel Sentences. The set of kernel 
sentences is the base of the coherent discourse. The meaning of a 
kernel sentence is the proposition into which it maps, and, simi- 
larly, the meaning of any name is the object which is its image un- 
der the mapping. To this set of kernels we apply a sequence of 
meaning preserving transformations, to get the final discourse. We 
use the word "transformation" in its broad general sense, not in 
the narrow technical sense defined by Chomsky (7). 

There are two distinct types of transformations, structural and 
definitional. A structural or syntactic transformation is only de- 
pendent on the structure of the kernel string(s) on which it operates. 
For example, one syntactic transformation takes a kernel in the ac- 
tive voice to one in the passive voice. Another combines two sen- 
tences into a single complex coordinate sentence. 

One large class of syntactic transformations is used to sub- 
stitute pronominal phrases for names. These may be 
ordinary pronouns such as "he", "she", or "it". They may be refer- 
ential phrases such as "the latter", "the former", or "this quantity". 
They may also be truncations of a full name such as "the distance" 
for "the distance between New York and Los Angeles". In cases where 
such pronominal reference is made, the coherence of the final dis- 
course is dependent on the order in which the resultant strings 
appear. 
The second type of transformation, definitional, in- 
volves substitutions of linguistic strings and forms for forms ap- 
pearing in the kernel sentences. For example, for any occurrence of 
"2 times" we may substitute "twice"; for ".5 times" we may substitute 
"one half of". In addition to this string substitution, defi- 
nitional transformations perform form substitution and rearrangement. 
For example, for a kernel sentence of the form "x is y more than z", 
where x, y and z are any names, a definitional transformation could 
substitute "x exceeds z by y." 

Some transformations are optioiM^fdBaak ^Bwaygaagy-^ taaxB^stxmyi 
if certsli^.i&nanft. «we i^esexA ia th)&:kcmBL aafe^T ; &ertleti.n. tssranfoClDa- 
tions aceus^i bya sp«rit€8r tjEoc gBjtylitttticS pucitoflieat^vJiQqr r«xi^anpLe» 
to empha«i«^ pertiAiri cM>Jeol:S);::o£h3aa: O^nfestreic XaaiiiafiBhKttimiSrsaKh 
as those vrti^cj^ iiiBiie^fpxiBi praisooKLnal. subti:^t3i&iDn»flti3s;misfid -because ir 
they decz«e9e,;t;iift d<qpCh gtf; Q cons^iructism^^^^ 



Let us review the steps in the generation of a coherent 
discourse. The speaker chooses a set of propositions, the "ideas" 
he wishes to communicate. He then encodes them as language strings, 
kernel sentences, in the manner described above. He then chooses a 
sequence of structural and definitional transformations which are 
defined on this set of kernels or on the ordered set of sentences 
which result from applications of the first transformations. The 
resulting sequence of sentences will be a coherent discourse to a 
listener if he knows all the definitional transformations applied. 
In addition, for every pair of distinct names which the speaker maps 
back into the same object, the listener must also map these names 
into a single object. 

In order to clarify this theory, we show in Appendix B a 
sample semantic generative grammar '(rtiich will generate coherent dis- 
course understandable by the STUDENT problem solving program. The ob- 
jects are numbers and the functions are the arithmetic operations 

o£ sum, di£fe3eem^rW(^>^^^'^^^^*P^ 

the model Iff ^mmer Icari equaltt^. - i^e ' ti^iasfoxuiMr^&itiBl >^sre deserily^ 
informally; "finrtheii lltig^jrfklEie !ii«r«8e4g^l3!aifd« 1n*^ bdfore a 

fonoaliKKtation for tsoap&xciBeetldffiS^ eas:^eidef(^i;4ed!t)q(»aM''^S!^ 
to the^ granmar is a sn^le^uobJLeni SPi^^i^'^'^'Bd ^ iJtsLMtftn^tihf ft; gram- 
mar. This problem is solvable by the STUDENT system. 



F. Analysis of Coherent Discourse. 

Generation of coherent discourse consists of two distinguish- 
able steps. From propositions in the speaker's model of the world, 
he generates an ordered set of kernel sentences. He then applies a 
sequence of transformations to this kernel set. The resulting dis- 
course is an encoded message which is to be analyzed and decoded by a 
listener. The listener's problem can be loosely characterized as an 
attempt to answer the question, "What would I have meant if I said 
that?" 

To analyze a discourse the listener must find the set of ker- 
nel sentences from which it was generated. One way to do this is 
to find a set of inverse transformations which when applied to the 
input discourse yield a sequence of kernel sentences. The listener 
must then transform these kernel sentences to an appropriate rep- 
resentation in his information store. The appropriateness of a rep- 
resentation is a function of what later use the listener expects to 
make of the information contained in the discourse. The listener 
may simultaneously transform a given kernel sentence into a number 
of different representations in his information store. On a level 
of pragmatic analysis, statements require only storage of information. 
Questions and imperatives require appropriate responses from the 
listener. The difficulties in analyzing a discourse fall into those 
associated with finding the kernel sentences which are the base of 
the discourse, and those associated with transforming these kernel sen- 
tences into representations in the information store. 

Matthews (29) has suggested that analysis be performed by 
synthesis. A sequence of kernel sentences and a sequence of trans- 
formations are chosen, and the transformations are applied to the ker- 
nel sentences. The resulting discourse is matched against the input. 
If they are the same, these kernel sentences and transformations give 
the required analysis of the input. If not, another choice is made and 
the resulting discourse is again compared with the input. 

If the kernel sentences and transformations were chosen ran- 
domly, this method would obviously be too inefficient to work in 
any practical sense. However, by utili8ing^Mii6tlfc«ltoaL!«3Bi.fi^ ^ 
d iscoursBy /th«neM6iee of ke#fl«!l^ Mid ^t#«n«£^ilif##£di#>^# ^m ^p^ely 
restricted; Tliisxt««lin£4u*<'«fe#iMlt«i^ 'iim.Vfi^'M'-'^^^mmi ^^ei^ntdtf ^ 
in a ptogram being ir#ftn:ltt*C<«lHUi%1*anJ^^^i^ ? ffu^^ 

technique has the advantage that exactly the same grammar can be 
utilized for both analysis and generation of discourse. 
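
The method of analysis by synthesis can be sketched, for a deliberately tiny grammar, as an exhaustive search. The fragment below is only an illustration under stated assumptions: the two kernels, the two transformations, and the function name are hypothetical, a single sentence stands in for a whole discourse, and no heuristics are used to restrict the choices, which is exactly why the unrestricted method is impractical for a grammar of realistic size.

```python
from itertools import product

# Hypothetical kernel sentences the "speaker" could have started from.
KERNELS = ["x is 2 times y", "x is y plus 2"]

# Hypothetical meaning-preserving transformations, each mapping a string
# to a string. "twice" is a definitional transformation.
TRANSFORMATIONS = {
    "identity": lambda s: s,
    "twice": lambda s: s.replace("2 times", "twice"),
}


def analyze_by_synthesis(sentence):
    """Generate candidates and return the (kernel, transformation name)
    pair that reproduces `sentence`, or None if no candidate matches."""
    for kernel, (name, transform) in product(KERNELS, TRANSFORMATIONS.items()):
        if transform(kernel) == sentence:
            return kernel, name
    return None
```

Because generation and analysis share the tables KERNELS and TRANSFORMATIONS, the same grammar serves both directions, as noted above.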

A more diT«et afttflytieal #pi^rosi^ waa^W «j*il4il«^ *et 6f in- 
verse analytic tran0£»SB8ei&}it8. y-%'^^MB'^i.»'''mis^eeimUietai^iiiiiri' t^rtt i«y^^ ^ 
be-«»edv:in -gmiersitii^ a -dtiivooiIrM^, wi* l5i^'SyjJwi^Wwheip«.''S ^nfl ¥''''iwfe': ■■■ 
sets o%i9aa?i«Qc.9mi^:\mtt^R^ toiS^ : 

verse-of ::T^yiis«i4jottlyi If •*^fV%i-*' ^ ^'f>me=*h«^:€ri*«i*fe4«*iJ tW*&>iJ '•■-■ 
verse transfonntftiiKi* tur *ff\y atal t^«» ca?iteri4>f»^»*C3 a^pip4ic**lcWt J 
may ag»to be/x:«ss^pt[^fl#: cby '-'t^tlsihg' ltl»£eti^dsi -iseisewnk^^ Vit^'? ■ t :. -, ; ■ . 
features jOf tire^4t^g|^v^^;'\.;^ ^:<'';;; .;'^, -^ --■.n:;;-.-Tni:ii l,^^'..,: --,;•:: ,; ,<\,:i : _ i 

Once the Ndses^CHi^ kernel ««ttC«a«s*s flW3*5|^v#(k ^R** 
course is 4f|^ffi?n|ia«4, „|^!itf , jewf Iff jthf ,gr0^1f»; 9^H#Rl*«4^n«iTep-roi j 

resentf t:i(»%tdO|E'#»^»« PftfWn?!lin|rfn<l»ei li!rf*fn«rJ8;?!il-P^ll»«f*i«W n^9S:Pr 
ThejB«^o?3|ffqbl«i!»49nf?WWBA4fHiRgi$h^ tlifS ^99^9Um 

of those words whicho^i;§gPf)f;^}e|xMS«»4/l^i«?;^©JWP3lM&f#if**Oipfr,r ap^^ 
those which are part of a name. This is difficult because the same 
word (^,ei!^{?f»gi|4|ffeigt?8jBil|o|> e^wpynbavfrs^wiSi-plfjIfief fjnrp^ |.a9S9#g«N 
Haviiig^ilgaif^ttii, tilif..ieia 

thft A^tmsBita P^|;|bJ?« 5€|3latioB,aaaSi6«Si»hfBjaB*4y!Pib*l»#Dni^ 1^ i . 
termajpif ccpppBH^»,;j%^Jkc^:>^i;(i fi!m§|4eMliiii»4P*iSif#»P8 9S#D^b#ci 
whicli are ^)94j%1^, :?»«»«%? l^«cwi»-|bi»:iB«rf iMs*»9$fCTS, ^Bfeiatio^l rii 
lingpiiftt^x^ ^M^)^ , vf m|<^i«nf 1 ^ WsOT|8t|.§i;|f»pai;if iM|o8i»t^p58«ini»i j *fee 
discourse can be traij%|^9|ed irn6«|-a Gaii?in4s#i >5«pf ff #©$«» Aop Ipi $b« ; 
information store of the listener. 



G. Limited Deductive Models. 

imp Jji, Uh^t sthrS i|iip^iiB«q[taftfe^ .^fe tillftc4l#«iO«^#uJ«n*iiii»feflnnatton3 - 
8t(y59iiis ewwMtiUiaj^r4^<aw*rBMK?.*P^^^ 

at least |oii,J;he,i»fe«ira%«!f(^ iSm i|i«*M»«.y3lwpB«*e«fc»-Ti 

tion must preserve «44-fe«&»TOatl«OttowsJ^ ^ ^^ 



Xf tthe Ust^mei- As only .inft«r«^^ iiaiOfif^kto:j«^?*ct»iofc tthe 
di8coui:»e,;i»e7 n««»4««o«ii^ ^na^ew© iiniBa»tiQ»d»«lMr«<Mli*oihi4 i^rttsemeaty 
and di.«cawa 3bfee ar^fc*wH£tbin.m« «»» ,«»feTi«fc*«t*«»fc |M :5i*»«»«e»* » ^»»^- 
el is ispiiip^hli: to |:h«>iBi«M»fc«is'fciap4ai>^^^^fi^ «• 

vant ded*jH*i«»i«ih*clte5amrbe>«pde 1^ ^fawl )«^«c«]| i|rtrtth*f»ka«is fff ithea . 
di8coura»,jPWfcBa4M^b«i»iirfet*yr*fee; Wftt«WWi,!lcfa^airiitr«*i»f«!M«iT«»*^ - 
interest, t*«i lla«i«M«i«lU^J»im**l%iitoJ8aW»^^^^^ ««% 

call such restricted information stores limited deductive models. 

The q4i«8jti(^k<-f|iiswex^^i9ig2{»»:«Eaia0 aiiJdtoilwi55'*»4dMPb«isl<; *i»d 



the STUDENT system, all utilize limited deductive models. For the 
area of interest in each of these programs there is a "natural" 
representation for the information in the allowable inputs. These 
representations were natural in that they facilitated the deduction 
of implicit information. For example, Lindsay's family tree rep- 
resentation made it easy to compute the relationship of any two in- 
dividuals in the tree, independent of the number of sentences nec- 
essary to build the tree. 

Because the number of relations and functions expressible 
in the models in all three systems is very limited, there is a 
corresponding limitation on the number of linguistic forms that may 
appear in the input. This greatly simplifies the parsing problem 
discussed earlier, by restricting alternatives for words in the 
input text. 



H. The STUDENT Deductive Model. 

The STUDENT system is an implementation of the analytic por- 
tion of our theory. STUDENT performs certain inverse transformations 
to obtain a set of kernel sentences and then transforms these kernel 
sentences to expressions in a limited deductive model. Utilizing 
the power of this deductive model, within its limited domain of under- 
standing, it is able to answer questions based on information im- 
plicit in the input information. 

The analytic and transformational techniques utilized in 
STUDENT are described in detail in Chapter IV. We shall describe 
here the canonical representation of objects, relations and func- 
tions within the model. STUDENT is restricted to answering questions 
framed in the context of algebra story problems. Algebraic equa- 
tions are a natural representation for information in the input. 
The objects in the model are numbers, or numbers with an as- 
sociated dimension. The only relation in the model is equality, and 
the only functions represented directly in the model are the arith- 
metic operations of addition, negation, multiplication, division 
and exponentiation. Other functions are defined in terms of these 
basic functions, by composition, and/or substitution of constants 
for arguments of these functions. For example, the operation of 
squaring is defined as exponentiation with "2" as the second argu- 
ment of the exponential function; subtraction is a composition of 
addition and negation. 
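
The definition of derived functions by composition and by substitution of constants can be sketched as follows. This is an illustration only: Python tuples stand in for the lists used inside the computer, the helper names are hypothetical, and EXPT is assumed here as the name of the exponentiation operator.

```python
def negate(a):
    """Negation, one of the basic functions of the model."""
    return ("MINUS", a)


def subtract(a, b):
    """Subtraction as a composition of addition and negation."""
    return ("PLUS", a, negate(b))


def square(a):
    """Squaring as exponentiation with the constant 2 substituted for the
    second argument. EXPT is an assumed operator name."""
    return ("EXPT", a, 2)
```

Under these definitions subtract("A", "B") yields the expression for A + (-B), so no separate subtraction or squaring operator is needed in the model.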

Within the computer, a parenthesized prefix notation is used 
for a standard representation of the equations implicit in the En- 
glish input. The arithmetic operation to be expressed is made the 
first element of a list, and the arguments of the function are suc- 
ceeding list elements. The exact notation is given in Figure 2 below. 



Operation         Infix Notation     Prefix Notation 

Equality          A = B              (EQUAL A B) 
Addition          A + B              (PLUS A B) 
                  A + B + C          (PLUS A B C) 
Negation          - A                (MINUS A) 
Subtraction       A - B              (PLUS A (MINUS B)) 
Multiplication    A * B              (TIMES A B) 
                  A * B * C          (TIMES A B C) 
Division          A / B              (QUOTIENT A B) 
Exponentiation    A^B                (EXPT A B) 

Figure 2: Notation Within the STUDENT Deductive Model 


In the figure, A, B, and C are any representations of objects in the 
model, either composite or simple names. The usual infix notation for 
these functional expressions is given for comparison. Because this 
is a fully parenthesized notation, no ambiguity of operational order 
arises, as it does, for example, for the unparenthesized infix nota- 
tion expression A*B+C or its corresponding natural language expres- 
sion "A times B plus C". Note also that in the prefix notation plus 
and times are not strictly binary operators. Indeed, in the model 
they may have any finite number of arguments, e.g. (TIMES A B C D) 
is a legitimate expression in the STUDENT model. 
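
These two properties of the prefix notation, freedom from ambiguity of operational order and operators with any finite number of arguments, can be checked with a small evaluator. The sketch below is not STUDENT's code: nested Python tuples stand in for LISP lists, the function name and the dictionary of values are hypothetical, and EXPT is an assumed name for the exponentiation operator.

```python
import operator
from functools import reduce


def evaluate(expr, values):
    """Evaluate a prefix expression given numeric values for the names."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return values[expr]
    op, *args = expr
    nums = [evaluate(arg, values) for arg in args]
    if op == "PLUS":                     # any finite number of arguments
        return sum(nums)
    if op == "TIMES":                    # any finite number of arguments
        return reduce(operator.mul, nums, 1)
    if op == "MINUS":                    # negation, exactly one argument
        (a,) = nums
        return -a
    if op == "QUOTIENT":
        a, b = nums
        return a / b
    if op == "EXPT":                     # assumed operator name
        a, b = nums
        return a ** b
    if op == "EQUAL":
        a, b = nums
        return a == b
    raise ValueError("unknown operator: " + str(op))


values = {"A": 2, "B": 3, "C": 4, "D": 5}
# The two readings of "A times B plus C" are distinct prefix expressions.
first_reading = evaluate(("PLUS", ("TIMES", "A", "B"), "C"), values)
second_reading = evaluate(("TIMES", "A", ("PLUS", "B", "C")), values)
```

The infix string A*B+C corresponds to two different prefix expressions, so whichever reading is intended must be fixed when the English input is transformed, not when the expression is evaluated.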

Representations of objects in the STUDENT deductive model 
are taken from the input. Any string of words not containing a 
linguistic form associated with the arithmetic functions expressible 
in the model is considered a simple name for an object. Thus, "the age 
of the child of John and Jane" is considered a simple name because it 
contains no functional linguistic forms associated with functions rep- 
resented in STUDENT's limited deductive model. In a more general 
model it would be considered a composite name, and the functional 
forms "age of ____" and "child of ____ and ____" would be 
mapped into their corresponding functions in the model. 

Because such complex strings are considered simple names in 
the model, and objects are distinguished only by their names, it 
is important to determine when two distinct names actually refer to 
the same object. In fact, answers to questions in the STUDENT sys- 
tem are statements of the identity of the object referenced by two 
names. However, one of the names (the desired one) must satisfy 
certain lexical conditions. Most often this condition is just that 
the name be a numeral. For a more general model this restriction 
could be stated as requiring a simple name corresponding to some 

functionally defined name — because, for example, "number of " 

would be a functional linguistic form in the general model, and the 
only simple name for such an object would be the numeral corres- 



ponding to this number. An answer consists of a statement of 
identity, e.g. "The number of customers Tom gets is 162." 

The other lexical restriction on answers sometimes used in 
the STUDENT system is the requirement that a certain unit (corres- 
ponding to a dimension associated with a number) appear in the de- 
sired answer. For example, spans is the unit specified by the ques- 
tion "How many spans equals 1 fathom?", and the answer given by 
STUDENT is "1 fathom is 8 spans". 

The deductive model described here is useful for answering 
questions because we know how to extract implicit information from 
expressions in this model; that is, we know how to solve sets of 
algebraic equations to find numerical values which satisfy these 
equations. The solution process used in STUDENT is described in de- 
tail in Chapter VI. The transformation process, based on the theory 
described earlier, which STUDENT uses to go from an English input 
to this deductive model, is described in Chapter IV. 
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CHAPTgR HI; PRPCRAMMIN G FOBMAUSHS jgC) UmSAG&imimUXWK 



Almost any programming language is universal in the sense that
with enough time, space, and work at the implementation, any computable
function may be programmed. However, the task of programming can be
made much easier by the proper choice of a higher level problem
oriented programming language. The data to be manipulated by the
STUDENT system is symbolic, and of indefinite length and complexity.
For this reason, a list-processing language was the most appropriate
type of programming language for this task. There are a number of such
languages available, each having its own set of advantages and
disadvantages. For a description of the general properties of
list-processing languages, with a detailed comparison of four of the
better known list-processing languages, see Bobrow and Raphael (5).
Mostly because I knew it so well, I chose LISP (31) as the basic
language for the STUDENT system.

The LISP formalism is very convenient for programming recursive
tasks such as the solving of a set of simultaneous equations. However,
LISP does not provide any natural mechanisms for representing
manipulation of strings of English words, another very important
subtask in the STUDENT system. For this type of manipulation one would
like to perform a sequence of steps involving operations such as
recognizing a sentence format which fits a particular pattern, finding
certain elements in a sentence by their context, rearranging a string
of words, deleting, inserting, and duplicating parts of strings, and
others.

The LISP formalism cannot easily express such string
manipulations, though each could be individually programmed. However,
a formalism for just this sort of manipulation is the basis of the
COMIT (45) programming system. Rules in this formalism can easily
express very complex string manipulations. However, COMIT and LISP
cannot be used simultaneously, and the problem context necessitates
going back and forth between LISP-oriented tasks and COMIT-oriented
tasks. Therefore, using LISP, I constructed a program, called METEOR,
which would interpret string transformation rules in a COMIT-like
notation.

In constructing the METEOR interpreter, I effectively extended
the eloquence of the LISP programming language. Those operations
which could be done previously, but only awkwardly, could now be
expressed easily. An extended language with the properties of COMIT
and LISP could have been built from scratch, but it is much more
economical to achieve such an extension by embedding. The advantages
and disadvantages of language extension by embedding are discussed
in detail by Bobrow and Weizenbaum (6).



A. Specifying a Desired String Format. 

METEOR has been described in detail elsewhere, but we include
here a brief description of its features. We do this because use of
the notation makes description of the transformation process easier.
In addition, if any ambiguity becomes apparent in the explanation of
the operation of STUDENT, it may be resolved by consulting the
listing of the STUDENT program in the Appendix. In the latter case,
it may be necessary to consult the more complete specification of
METEOR referenced above.

A METEOR program consists of a sequence of rules each specifying
a string transformation and giving some control information. Let us
first consider how a string transformation is specified. We shall
call the string to be transformed the workspace. The workspace will
be transformed by a rule only if it fits a pattern or format given
in the "left half" of the rule. This left half is a list of
elementary patterns which specifies a sequence of items that must be
matched in the workspace. For example, if the left half were
"(THE BOY)" then a match would be found only if the workspace
contained a "THE" immediately followed by "BOY". In addition to
known constituents, one can match unknown constituents. The element
$1 in a left half will match any one workspace constituent. The
left half "(A $1 B $2 C)" will match a contiguous substring of the
workspace which consists of an A followed by exactly one constituent
(specified by the marker "$1") followed by a B, followed by exactly 2
constituents (matching the "$2") followed by an occurrence of a C.
Thus $1 will match an element of the workspace with a specified
context. If a left half would match more than one substring in the
workspace, the left-most such substring is the one found by the
matching process.

We have discussed elementary patterns which match a fixed number
of unknown constituents (e.g., "$3" matches 3 unknown constituents).
METEOR also has an elementary pattern element "$" which matches an
arbitrary number of unknown constituents. For example, the left half
(THE $ BOY) will match a substring of the workspace which starts with
an occurrence of "THE" followed by any number of constituents
(including zero) followed by an occurrence of "BOY". It would, for
example, match a substring of the workspace "(GIVE THE GOOD BOY)" or
of the workspace "(THE BOY HERE)". If the left half ($ GLITCH $3)
matches a substring of the workspace, then the elementary pattern "$"
matches the substring from the beginning of the workspace up to but
not including the first occurrence of "GLITCH"; the pattern "GLITCH"
matches this occurrence of "GLITCH" in the workspace; and the
elementary pattern "$3" matches the 3 elements or constituents of the
workspace immediately following GLITCH.

Elements in the workspace may be tagged or subscripted to
indicate special properties of this element; for example, one might
have (HAVE/VERB) or (BOY/NOUN) as elements of the workspace. Such
elements can be matched by name (using HAVE or BOY as pattern
elements) or identified just by their subscripts (or by both). The
elementary pattern ($1/VERB) will match any single constituent which
is a verb; that is, one which has the subscript VERB even if this
constituent has other subscripts. Thus the left half
(ALFRED ($1/VERB) BOOKS) will match the substring
(ALFRED (READS/VERB) BOOKS) in the workspace
(NOW ALFRED (READS/VERB) BOOKS).

Other elementary pattern elements are provided, and new pattern
elements can be defined and easily used within the METEOR system.
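
As an illustration only, the matching rules for untagged constituents can be sketched in Python. This is not the METEOR interpreter, which is written in LISP; the function name match_left_half is hypothetical, subscripted patterns such as ($1/VERB) are not handled, and the choice to let "$" take the shortest span that allows the rest of the pattern to match is an assumption.

```python
import re

def match_left_half(pattern, workspace):
    """Return the parts matched by each elementary pattern for the
    left-most match, or None if the match fails."""

    def match_from(p, w):
        # p indexes the pattern, w indexes the workspace.
        if p == len(pattern):
            return []
        elem = pattern[p]
        if elem == "$":
            # Any number of constituents, including zero (shortest first).
            for n in range(len(workspace) - w + 1):
                rest = match_from(p + 1, w + n)
                if rest is not None:
                    return [workspace[w:w + n]] + rest
            return None
        fixed = re.fullmatch(r"\$(\d+)", elem)
        if fixed:
            # Exactly n unknown constituents.
            n = int(fixed.group(1))
            if w + n > len(workspace):
                return None
            rest = match_from(p + 1, w + n)
            return None if rest is None else [workspace[w:w + n]] + rest
        # A known constituent must appear literally.
        if w < len(workspace) and workspace[w] == elem:
            rest = match_from(p + 1, w + 1)
            return None if rest is None else [[elem]] + rest
        return None

    # Try each starting position in turn, so the left-most match wins.
    for start in range(len(workspace) + 1):
        parts = match_from(0, start)
        if parts is not None:
            return parts
    return None
```

Calling match_left_half(["THE", "$", "BOY"], ["GIVE", "THE", "GOOD", "BOY"]) returns one list of constituents per elementary pattern, which is the form needed when parts of a match are later referenced by position.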



B. Specifying a Transformed Workspace. 

We have discussed how a desired format can be specified through
a prototype pattern, called a left half. If we try to match the
workspace to a left half, but it is not in the format specified, we
say the match has failed. If a substring of the workspace is in the
specified format, the match is successful. When there is a successful
match, we may wish to transform or manipulate the substring matched,
or place in a temporary storage location, called a shelf, copies of
segments of the matching substring. We shall now discuss the notation
used for specifying such transformations, and storage of material.

A left half is a sequence of elementary patterns, and we
associate with each elementary pattern a number indicating its
position in this left half sequence. For example, in the left half
($2 D $ E), the first elementary pattern, $2, would be associated
with the number 1, the second, D, with 2, $ with 3, and E with 4.
If a match is successful, each elementary pattern element in the left
half matches a part of the substring of the workspace matched by this
left half. The part matched by an elementary pattern can be
referenced by the number associated with this elementary pattern.
For the left half given above, and the workspace (A B C D B A E G),
the left-half match succeeds, and the substring (B C) may then be
referenced with the number 1, the substring (D) by 2, (B A) by 3,
and (E) by 4.






The transformed workspace is specified by the "right half"
of a METEOR rule. This right half may be just the numeral 0, in
which case the matched portion of the workspace is deleted. Otherwise
this right half must be a list of elements specifying a replacement
for the matched substring. Any numbers in this right-half list
reference (specify) the appropriate part of the matched substring.
Other items in the list may reference themselves, or strings in
temporary storage, or functions of any referenceable strings. In
the example discussed above, if the right half were (3 2 M 2 N), then
the matched portion of the workspace would be replaced by
(B A D M D N), and the workspace would become (A B A D M D N G).
Note that 1 and 4 were not mentioned in this right half and were
therefore deleted from the workspace. Also 3 and 2 were in reverse
order, and thus these referenced parts were inserted in the workspace
in an order opposite to that in which they had appeared. 2 is
referenced twice in the right half and therefore two copies of this
referenced substring appear in the workspace. The elements M and N in
this right half reference only themselves, and are therefore inserted
directly into the workspace.
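
The replacement step can be sketched in Python as follows. This is a minimal illustration, not METEOR itself: the function name apply_right_half is hypothetical, only numbers and constants are handled, and the matched parts for the left half ($2 D $ E) against the workspace (A B C D B A E G) are written out by hand.

```python
def apply_right_half(right_half, parts):
    """Build the replacement for a matched substring.

    parts[i - 1] holds the constituents matched by the i-th
    elementary pattern of the left half.
    """
    if right_half == 0:
        # The numeral 0 deletes the matched portion.
        return []
    replacement = []
    for item in right_half:
        if isinstance(item, int):
            # A number references the corresponding matched part.
            replacement.extend(parts[item - 1])
        else:
            # Any other item here is a constant referencing itself.
            replacement.append(item)
    return replacement


workspace = ["A", "B", "C", "D", "B", "A", "E", "G"]
# Parts matched by ($2 D $ E); the match covers positions 1 through 6.
parts = [["B", "C"], ["D"], ["B", "A"], ["E"]]
start, end = 1, 7

replacement = apply_right_half([3, 2, "M", 2, "N"], parts)
new_workspace = workspace[:start] + replacement + workspace[end:]
```

Parts 1 and 4 are dropped because the right half never mentions them, and part 2 appears twice because it is referenced twice.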

Using the right-half elements described, that is, numbers
referencing matched substrings and constants (elements referencing
themselves), one can express transformations of the workspace in
which elements have been added to, deleted from, duplicated in, and
rearranged in the workspace. Elements to be added to the workspace
thus far can only be constants. Let us consider some other possible
right-half elements. They are all indicated by lists which start with
special flags.

The contents of any shelf (temporary storage list) can be
referenced by a two element list with first element either *A (for All)
or *N (for Next), and a second element, the shelf name. For example,
(*A EQT) references the entire contents of a shelf named EQT. If this
element appeared in a right half, the entire contents of that shelf
would be placed in the corresponding place in the workspace. The
first element of a shelf named SENTENCES could be put into the
workspace by using the element (*N SENTENCES) in a right half.

The flag FN as the first member of a list serving as a right-half
element indicates that the next member of this list is a function
name, and the following ones are the arguments of this function. The
value of the function for this set of arguments is placed in the
workspace. In this way, any LISP function can be used within a METEOR
rule.

The flag *K indicates that the rest of the list following is to
be evaluated as a right-half rule, and then is to be "compressed"
into a list which will be a single element of the workspace. Thus,
chunks which are longer, and have more complex structure than a
single word can be treated as a single unit within the METEOR
workspace string. The inverse operation is the expansion of a chunk,
so that all its components appear as individual constituents in the
workspace. Expansion is indicated by a *E flag at the beginning of
a right-half element list.

We have thus far discussed how the transformation of a string,
called the workspace, can be expressed in terms of a left half, which
is a pattern for a desired input format, and a right half which is a
pattern for the desired output format. There is no reason to limit to
one the number of outputs from a single left half match. In fact, a
third section of a METEOR rule, called the routing section (for
historical reasons), allows the programmer to give any number of other
right halves, and place these referenced strings at the beginning or
end of any shelf (temporary storage list). The storage of such a
"right half" is indicated in the routing section by a list starting
with a *S or a *Q, followed by the shelf name, and followed by a
right half pattern. The *S indicates that the referenced material is
to be Stored on the beginning of the named shelf; *Q indicates that
it should be Queued on the end of the shelf. Used with a *N for
retrieval, a shelf built up by *S is a pushdown list (a last-in-
first-out list), and a shelf built up by *Q is a queue (a first-in-
first-out list).
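
The two storage disciplines can be sketched in Python with a double-ended queue. The names shelves, store, queue and next_item are hypothetical stand-ins for the *S, *Q and *N operations, and the sketch assumes that *N removes the element it retrieves.

```python
from collections import deque

shelves = {}  # shelf name -> deque of stored items


def store(name, items):
    """*S: put the items on the beginning of the named shelf."""
    shelf = shelves.setdefault(name, deque())
    # extendleft reverses its argument, so reverse first to keep order.
    shelf.extendleft(reversed(items))


def queue(name, items):
    """*Q: put the items on the end of the named shelf."""
    shelves.setdefault(name, deque()).extend(items)


def next_item(name):
    """*N: take the first element of the named shelf."""
    return shelves[name].popleft()
```

With store followed by next_item the shelf behaves as a pushdown list; with queue followed by next_item it behaves as a first-in-first-out queue.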

The only other significant feature of a METEOR program that we
have not yet touched on is the control structure of a set of rules.
A METEOR rule has a name, and has a "go-to" section. Ordinarily, if
the left-half match fails, control is automatically passed to the
next rule in sequence. If the left-half match succeeds, the right half
and routing sections are interpreted, and then control is passed to
the rule named in the "go-to". However, by inserting a "*"
immediately after the rule name in the rule, the method of transfer of
control is switched, and only on left-half failure will control pass
to the rule named in the "go-to".

Routing control can also be changed by a list of the form
"(*D name1 name2)" in the routing section of a rule. After this list
is interpreted, any occurrence of name1 in a "go-to" will be
interpreted as a "go-to" containing name2. This feature of METEOR
allows easy return from subroutines. The ability to use either success
or failure as a switch for the transfer of control makes it possible
to write significant one rule loops.

A METEOR program is a sequence (list) of rules. Each rule is
a list of up to six elements. The following is an example of a METEOR
rule containing all six elements:

(NAME * ($ BOY) (2 1) (/ (*S S1 2 2) (*D P1 P2)) P1)

We shall briefly review the function of each of these six elements.
The first element of a METEOR rule is a name, and must be present
in any rule. If no name is needed, the dummy name "*" can be used.
The second element is a "*" and is optional. When it is present it
reverses the switch on flow of control, and transfer of control to the
rule named in the "go-to" is made on left-half failure.

The third element is mandatory, and is a left-half pattern
which is to be matched in the workspace. The fourth element is
optional, and is a right-half pattern specifying the result in the
workspace of the string transformation desired. The fifth (optional)
element is called the routing section, and is a list flagged with
a "/" as a first element. The remainder of the routing section is a
sequence of lists which specify operations which place items on
shelves or set "go-to" values. The final element is called the
"go-to" and specifies where control is to be passed if a match
succeeds (in the normal case). A "*" in this position specifies the
next rule in sequence.
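
The transfer of control between rules can be sketched in Python. The function next_rule and its parameters are hypothetical; the sketch assumes that a rule with the reversing "*" passes control to the next rule in sequence when its match succeeds, and that running off the end of the program simply stops it, neither of which is stated above.

```python
def next_rule(rule_names, current, matched, reversed_switch, goto,
              redirects=None):
    """Return the name of the rule to run next, or None at program end.

    rule_names      -- rule names in program order
    current         -- name of the rule just interpreted
    matched         -- True if its left-half match succeeded
    reversed_switch -- True if a "*" follows the rule name
    goto            -- the rule's "go-to" element ("*" means next rule)
    redirects       -- mapping set up by (*D name1 name2) lists
    """
    redirects = redirects or {}
    goto = redirects.get(goto, goto)
    # Normally the go-to is taken on success; "*" flips this to failure.
    take_goto = matched != reversed_switch
    if take_goto and goto != "*":
        return goto
    position = rule_names.index(current) + 1
    return rule_names[position] if position < len(rule_names) else None
```

A one rule loop is then a rule whose go-to names itself: it is re-entered for as long as the condition selecting the go-to holds.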



C. Summary.

In this chapter, we have briefly summarized the features of a
language for string manipulation which has been embedded (by building
the METEOR interpreter) in the general list-processing language LISP.
The ability to describe easily in METEOR the string transformations
needed to process English sentences, and also use, where appropriate,
the functional notation of the general list-processing language, LISP,
was a great advantage in the programming effort involved in this study.

As a final illustration of the power of the combined METEOR-LISP
language, we include a program for Wang's algorithm for proving
theorems in the propositional calculus. This algorithm is described
on pages 44-45 of the LISP manual (31), and a LISP program for the
algorithm appears on pages 48-50. Figure 3 below contains the complete
METEOR program for the algorithm, including definitions of four
small auxiliary LISP functions used within the METEOR program.

In addition, the figure contains a trace of the program as it
proves the theorem given after the first line containing "(THEOREM)".
The other lines give the theorems that are proved by the algorithm as
steps in the proof of this theorem. This METEOR program compares
quite favorably in both size and understandability to the one given in
the LISP manual, and to the one COMIT program which I have seen which
performs the Wang algorithm.

Figure 3: A METEOR Program for the Wang Algorithm



CHAPTER IV: TRANSFORMATION OF ENGLISH TO THE STUDENT DEDUCTIVE MODEL



The STUDENT system consists of two main subprograms, called
STUDENT and REMEMBER. The program called REMEMBER accepts and
processes statements which contain global information; that is,
information which is not specific to any one story problem. We shall
discuss the processing and information storage techniques used
in REMEMBER in the next chapter. A listing of the global information
given to the STUDENT system may be found in Appendix C.

In this chapter, we shall describe the techniques embedded in
the STUDENT program which are used to transform an English statement
of an algebra story problem to expressions in the STUDENT deductive
model. By implication we are also defining the subset of English
which is "understood" by the STUDENT program. A more explicit
description of this input language is given at the end of the chapter.



A. Outline of the Operation of STUDENT.

To provide perspective by which to view the detailed heuristic
techniques used in the STUDENT program, we shall first give an
outline of the operation of the STUDENT program when given a problem to
solve. This outline is a verbal description of the flow chart of
the program found in Appendix A.

STUDENT is asked to solve a particular problem. We assume that
all necessary global information has been stored previously. STUDENT
will now transform the English input statement of this problem into
expressions in its limited deductive model, and through appropriate
deductive procedures attempt to find a solution. More specifically,
STUDENT finds the kernel sentences of the input discourse, and
transforms this sequence of kernels into a set of simultaneous
equations, keeping a list of the answers required, a list of the units
involved in the problem (e.g. dollars, pounds) and a list of all the
variables (simple names) in the equations. Then STUDENT invokes the
SOLVE program to solve this set of equations for the desired unknowns.
If a solution is found, STUDENT prints the values of the unknowns
requested in a fixed format, substituting in "(variable IS value)" the
appropriate phrases for variable and value. If a solution cannot be
found, various heuristics are used to identify two variables (i.e.
find two slightly different phrases that refer to the same object in
the model). If two variables, A and B, are identified, the equation
A = B is added to the set of equations. In addition, the store of
global information is searched to find any equations that may be
useful in finding the solution to this problem. STUDENT prints out any
assumptions it makes about the identity of two variables, and also any
equations that it retrieves because it thinks they may be relevant.
If the use of global equations or equations from identifications
leads to a solution, the answers are printed out in the format
described above.

If a solution was not found, and certain idioms are present in
the problem (a result of a definitional transformation used in the
generation of the problem), a substitution is made for each of these
idioms in turn and the transformation and solution process is
repeated. If the substitutions for these idioms do not enable the
problem to be solved by STUDENT, then STUDENT requests additional
information from the questioner, showing him the variables being used
in the problem. If any information is given, STUDENT tries to solve
the problem again. If none is given, it reports its inability to solve
this problem and terminates. If the problem is ever solved the
solution is printed and the program terminates.



B. Categories of Words in a Transformation.

The words and phrases (strings of words) in the English input
can be classified into three distinct categories on the basis of how
they are handled in the transformation to the deductive model. The
first category consists of strings of words which name objects in the
model; I call such strings variables. Variables are identified only
by the string of words in them, and if two strings differ at all, they
define distinct variables. One important problem considered below
is how to determine when two distinct variables refer to the same
object.

The second class of words and phrases are what I call
"substitutors". Each substitutor may be replaced by another string.
Some substitutions are mandatory; others are optional and are only
made if the problem cannot be solved without such substitutions. An
example of a mandatory substitution is "2 times" for the word "twice".
"Twice" always means "2 times" in the context of the model, and
therefore this substitution is mandatory. One optional "idiomatic"
substitution is "twice the sum of the length and width of the
rectangle" for "the perimeter of the rectangle". The use of these
substitutions in the transformation process is discussed below. These
substitutions are inverses of definitional transformations as defined
in Chapter II.

Members of the third class of words indicate the presence of
functional linguistic forms which represent functions in the deductive
model. I call members of this third class "operators". Operators
may indicate operations which are complex combinations of the basic
functions of the deductive model. One simple operator is the word
"plus", which indicates that the objects named by the two variables
surrounding it are to be added. An example of a more complex operator
is the phrase "percent less than", as in "10 percent less than the
marked price", which indicates that the number immediately preceding
the "percent" is to be subtracted from 100, this result divided by 100,
and then this quotient multiplied by the variable following the "than".
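
The arithmetic behind "percent less than" can be written out as a short Python sketch. The function name percent_less_than is hypothetical, and exact fractions are used only to keep the illustration free of rounding.

```python
from fractions import Fraction


def percent_less_than(number, value):
    """Interpret "<number> percent less than <value>".

    The number is subtracted from 100, the result is divided by 100,
    and the quotient is multiplied by the value.
    """
    return (100 - Fraction(number)) / 100 * value
```

For a marked price of 50, "10 percent less than the marked price" gives (100 - 10) / 100 * 50, that is, 45.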

Operators may be classified according to where their arguments
are found. A prefix operator, such as "the square of....." precedes
its argument. An operator like ".....percent" is a suffix operator,
and follows its argument. Infix operators such as ".....plus....."
or ".....less than....." appear between their two arguments. In a
split prefix operator such as "difference between.....and.....",
part of the operator precedes, and part appears between the two
arguments. "The sum of.....and.....and....." is a split prefix
operator with an indefinite number of arguments.

Some words may act as operators conditionally, depending on
their context. For example, "of" is equivalent to "times" if there
is a fraction immediately preceding it; e.g., ".5 of the profit" is
equivalent to ".5 times the profit"; however, "Queen of England"
does not imply a multiplicative relationship between the Queen and
her country.
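
The conditional treatment of "of" can be sketched in Python. The function resolve_of is hypothetical, and the test used here for "a fraction immediately preceding" (any unsigned decimal numeral) is an assumption about what counts as a number.

```python
import re

NUMERAL = re.compile(r"\d*\.?\d+")


def resolve_of(words):
    """Replace OF by TIMES when a numeral immediately precedes it."""
    result = []
    for i, word in enumerate(words):
        if word == "OF" and i > 0 and NUMERAL.fullmatch(words[i - 1]):
            result.append("TIMES")
        else:
            result.append(word)
    return result
```

The word "NUMBER" is not a numeral, so "THE NUMBER OF CUSTOMERS" is left unchanged by this check.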



C. Transformational Procedures. 

Let us now consider in detail the transformation procedure used
by STUDENT, and see how these different categories of phrases interact.
To make the process more concrete, let us consider the following example
which has been solved by STUDENT.



(THE PROBLEM TO BE SOLVED IS)
(IF THE NUMBER OF CUSTOMERS TOM GETS IS TWICE THE SQUARE OF
20 PER CENT OF THE NUMBER OF ADVERTISEMENTS HE RUNS, AND THE
NUMBER OF ADVERTISEMENTS HE RUNS IS 45, WHAT IS THE NUMBER
OF CUSTOMERS TOM GETS Q.)

Shown below are copies of actual printout from the STUDENT
program, illustrating stages in the transformation and the solution of
the problem. The parentheses are an artifact of the LISP programming
language, and "Q." is a replacement for the question mark not available
on the key punch.

The first stage in the transformation is to perform all
mandatory substitutions. In this problem only the three phrases
underlined (by the author, not the program) are substitutors: "twice"
becomes "2 times", "per cent" becomes the single word "percent", and
"square of" is truncated to "square". Having made these substitutions,
STUDENT prints:



(WITH MANDATORY SUBSTITUTIONS THE PROBLEM IS)
(IF THE NUMBER OF CUSTOMERS TOM GETS IS 2 TIMES THE SQUARE
20 PERCENT OF THE NUMBER OF ADVERTISEMENTS HE RUNS, AND THE
NUMBER OF ADVERTISEMENTS HE RUNS IS 45, WHAT IS THE NUMBER
OF CUSTOMERS TOM GETS Q.)
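
The mandatory substitution stage for this example can be sketched in Python. The table MANDATORY lists only the three substitutions named for this problem, and the function substitute is hypothetical; STUDENT itself performs this step with METEOR rules.

```python
# (phrase to replace, replacement), each as a list of words.
MANDATORY = [
    (["TWICE"], ["2", "TIMES"]),
    (["PER", "CENT"], ["PERCENT"]),
    (["SQUARE", "OF"], ["SQUARE"]),
]


def substitute(words, table=MANDATORY):
    """Scan left to right, replacing each listed phrase where it occurs."""
    result = []
    i = 0
    while i < len(words):
        for old, new in table:
            if words[i:i + len(old)] == old:
                result.extend(new)
                i += len(old)
                break
        else:
            # No substitutor starts here; copy the word unchanged.
            result.append(words[i])
            i += 1
    return result
```

Applied to the words of the problem statement, this yields the word sequence of the printout above.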



From dictionary entries for each word, the words in the problem
are tagged by their function in terms of the transformation process,
and STUDENT prints:



(WITH WORDS TAGGED BY FUNCTION THE PROBLEM IS)
(IF THE NUMBER (OF / OP) CUSTOMERS TOM (GETS / VERB) IS
2 (TIMES / OP 1) THE (SQUARE / OP 1) 20 (PERCENT / OP 2) (OF / OP)
THE NUMBER (OF / OP) ADVERTISEMENTS (HE / PRO) RUNS , AND THE
NUMBER (OF / OP) ADVERTISEMENTS (HE / PRO) RUNS IS 45 ,
(WHAT / QWORD) IS THE NUMBER (OF / OP) CUSTOMERS
TOM (GETS / VERB) (QMARK / DLM))

If a word has a tag, or tags, the word followed by "/", followed by
the tags, becomes a single unit, and is enclosed in parentheses. Some
typical taggings are shown above. "(OF/OP)" indicates that "OF" is
an operator and other taggings show that "GETS" is a verb, "TIMES"
is an operator of level 1 (operator levels will be explained below),
"SQUARE" is an operator of level 1, "PERCENT" is an operator of level
2, "HE" is a pronoun, "WHAT" is a question word, and "QMARK"
(replacing Q.) is a delimiter of a sentence. These tagged words will
play the principal role in the remaining transformation to the set of
equations implicit in this problem statement.

The next stage in the transformation is to break the input
sentences into "kernel sentences". As in the example, a problem may
be stated using sentences of great grammatical complexity; however,
the final stage of the transformation is only defined on a set of
kernel sentences. The simplification to kernel sentences as done in
STUDENT depends on the recursive use of format matching. If an input
sentence is of the form "IF" followed by a substring, followed by
a comma, a question word and a second substring (i.e. it matches the
METEOR left half "(IF $ , ($1/QWORD) $)") then the first substring
(between the IF and the comma) is made an independent sentence, and
everything following the comma is made into a second sentence. In
the example, this means that the input is resolved into the
following two sentences (where tags are omitted for the sake of
brevity):



"The number of customers Tom gets is 2 times the
square 20 percent of the number of advertisements
he runs, and the number of advertisements he runs
is 45." and "What is the number of customers Tom gets?"



This last procedure effectively resolves a problem into
declarative assumptions and a question sentence. A second complexity
resolved by STUDENT is illustrated in the first of these two
sentences. A coordinate sentence consisting of two sentences joined by
a comma immediately followed by an "and" (i.e. any sentence matching
the METEOR left half "($ , AND $)") will be resolved into these two
independent sentences. The first sentence above is therefore resolved
into two simpler sentences.

Using these two inverse syntactic transformations, this problem
statement is resolved into simple kernel sentences. For the
example, STUDENT prints



(THE SIMPLE SENTENCES ARE)
(THE NUMBER (OF / OP) CUSTOMERS TOM (GETS / VERB) IS
2 (TIMES / OP 1) THE (SQUARE / OP 1) 20 (PERCENT / OP 2)
(OF / OP) THE NUMBER (OF / OP) ADVERTISEMENTS (HE / PRO)
RUNS (PERIOD / DLM))
(THE NUMBER (OF / OP) ADVERTISEMENTS (HE / PRO) RUNS IS 45
(PERIOD / DLM))
((WHAT / QWORD) IS THE NUMBER (OF / OP) CUSTOMERS TOM
(GETS / VERB) (QMARK / DLM))



Each simple sentence is a separate list, i.e. is enclosed in
parentheses, and each ends with a delimiter (a period or question
mark). Each of these sentences can now be transformed directly to its
interpretation in the model.
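
The two splitting rules, "IF ... , question-word ..." and "... , AND ...", can be sketched as a recursive Python function over untagged words. The function kernel_sentences, the short list of question words, and the use of "." as the delimiter added to each declarative kernel are assumptions made for the illustration.

```python
def kernel_sentences(words, qwords=("WHAT", "HOW")):
    """Resolve a list of words into a list of kernel sentences."""
    # Rule 1: IF $ , (question word) $
    if words and words[0] == "IF":
        for i in range(1, len(words) - 1):
            if words[i] == "," and words[i + 1] in qwords:
                return (kernel_sentences(words[1:i] + ["."], qwords)
                        + kernel_sentences(words[i + 1:], qwords))
    # Rule 2: $ , AND $
    for i in range(len(words) - 1):
        if words[i] == "," and words[i + 1] == "AND":
            return (kernel_sentences(words[:i] + ["."], qwords)
                    + kernel_sentences(words[i + 2:], qwords))
    # Neither format matches: the sentence is already a kernel.
    return [words]
```

Because each piece is passed back through the same function, a sentence combining both formats, as in the example, is fully resolved in one call.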
The transformation from the simple kernel sentences to
equations uses three levels of precedence for operators. Operators of
higher precedence level are used earlier in the transformation.
Before utilizing the operators, STUDENT looks for linguistic forms
associated with the equality relation. These forms include the copula
"is" and transitive verbs in certain contexts. In the example we are
considering, only the copula "is" is used to indicate equality. The
use of transitive verbs as indicators of equality, that is, as
relational linguistic forms, will be discussed in connection with
another example. When the relational linguistic form is identified,
the names which are the arguments of the form are broken down into
variables and operators (functional linguistic forms). In the present
problem, the two names are those on either side of the "is" in each
sentence.

The word "is" may also be used meaningfully within algebra
story problems as an auxiliary verb (not meaning equality) in such
verbal phrases as "is multiplied by" or "is divided by". A special
check is made for the occurrence of these phrases before proceeding
on to the main transformation procedure. The transformation of
sentences containing these special verbal phrases will be discussed
later. If "is" does not appear as an auxiliary in such a verbal phrase,
a sentence of the form "P1 is P2" is interpreted as indicating the
equality of the objects named by phrases P1 and P2. No equality
relation will be recognized within these phrases, even if an
appropriate transitive verb occurs within either of them. If P1* and
P2* represent the arithmetic transformations of P1 and P2, then "P1
is P2" is transformed into the equation

"(EQUAL P1* P2*)".
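
The recognition of the form "P1 is P2" can be sketched in Python over untagged words. The function equality is hypothetical; it splits at the first "IS", which is an assumption, and it returns the untransformed phrases P1 and P2 rather than P1* and P2*.

```python
AUXILIARY = (["IS", "MULTIPLIED", "BY"], ["IS", "DIVIDED", "BY"])


def equality(sentence):
    """Return ("EQUAL", P1, P2) for a "P1 is P2" sentence, else None."""
    for i, word in enumerate(sentence):
        if word == "IS":
            # "is" as an auxiliary verb does not indicate equality.
            if any(sentence[i:i + 3] == phrase for phrase in AUXILIARY):
                return None
            return ("EQUAL", sentence[:i], sentence[i + 1:])
    return None
```

The phrases on either side would then be transformed separately to give the two arguments of the equation.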



The transformation of P1 and P2 to give them an interpretation
in the model is performed recursively using a program equivalent to
the table in Figure 4. This table shows all the operators and
formats currently recognized by the STUDENT program. New operators can
easily be added to the program equivalent of this table.

In performing the transformation of a phrase P, a left to
right search is made for an operator of level 2 (indicated by
subscripts of "OP" and 2). If there is none, a left to right search is
made for a level 1 operator (indicated by subscripts "OP" and 1),
and finally another left to right search is made for an operator of
level 0 (indicated by a subscript "OP" and no numerical subscript).
The first operator found in this ordered search determines the first
step in the transformation of the phrase. This operator and its
context are transformed as indicated in column 4 in the table. If no
operator is present, delimiters and articles (a, an and the) are
deleted, and the phrase is treated as an indivisible entity, a variable.
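
The ordered search just described can be sketched as follows. This is a minimal illustration in Python, not the LISP program itself; the token representation (operators as tuples of word, "OP" and level, with level 0 standing for "no numerical subscript") and the function names are assumptions made for the example.

```python
# Hypothetical representation: a phrase is a list of tokens. An operator is a
# tuple (word, "OP", level); a delimiter is a tuple (word, "DLM").
def first_operator(phrase):
    """Return (index, token) of the operator that drives the next step, or None."""
    for level in (2, 1, 0):                # ordered search: level 2, then 1, then 0
        for i, tok in enumerate(phrase):   # left to right within each level
            if isinstance(tok, tuple) and tok[1] == "OP" and tok[2] == level:
                return i, tok
    return None


def as_variable(phrase):
    """With no operator present, drop delimiters and articles; the rest is one variable."""
    articles = {"A", "AN", "THE"}
    return tuple(
        tok for tok in phrase
        if not (isinstance(tok, tuple) and tok[1] == "DLM") and tok not in articles
    )
```

For a phrase containing both a level 1 and a level 2 operator, `first_operator` returns the level 2 operator even when the level 1 operator stands further to the left.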

In the example, the first simple sentence is

(THE NUMBER (OF/OP) CUSTOMERS TOM (GETS/VERB) IS
2 (TIMES/OP 1) THE (SQUARE/OP 1) 20 (PERCENT/OP 2)
(OF/OP) THE NUMBER (OF/OP) ADVERTISEMENTS (HE/PRO) RUNS (PERIOD/DLM))

This is of the form "P1 is P2", and is transformed to (EQUAL P1* P2*).
P1 is "(THE NUMBER (OF/OP) CUSTOMERS TOM (GETS/VERB))". The occurrence
of the verb "gets" is ignored because of the presence of the "is" in
the sentence, meaning "equals". The only operator found is "(OF/OP)".
From the table we see that if "OF" is immediately preceded by a number
(not the word "number") it is treated as if it were the infix "TIMES".
In this case, however, "OF" is not preceded by a number; the subscript
OP, indicating that "OF" is an operator, is
Operator     Precedence  Context                        Interpretation in the Model
             Level

PLUS         2           P1 PLUS P2                     (PLUS P1* P2*)                  (a)
PLUSS        0           P1 PLUSS P2                    (PLUS P1* P2*)                  (b)
MINUS        2           P1 MINUS P2                    (PLUS P1* (MINUS P2*))          (c)
MINUSS       0           P1 MINUSS P2                   (PLUS P1* (MINUS P2*))          (b)
TIMES        1           P1 TIMES P2                    (TIMES P1* P2*)
DIVBY        1           P1 DIVBY P2                    (QUOTIENT P1* P2*)
SQUARE       1           SQUARE P1                      (EXPT P1* 2)                    (d)
SQUARED      0           P1 SQUARED                     (EXPT P1* 2)
**           0           P1 ** P2                       (EXPT P1* P2*)
LESSTHAN     2           P1 LESSTHAN P2                 (PLUS P2* (MINUS P1*))
PER          0           P1 PER K P2                    (QUOTIENT P1* (K P2)*)          (e) (f)
                         P1 PER P2                      (QUOTIENT P1* (1 P2)*)
PERCENT      2           P1 K PERCENT P2                (P1 (K/100) P2)*                (f) (g)
PERLESS      2           P1 K PERLESS P2                (P1 ((100-K)/100) P2)*          (f) (g)
SUM          0           SUM P1 AND P2 AND P3           (PLUS P1* (SUM P2 AND P3)*)
                         SUM P1 AND P2                  (PLUS P1* P2*)
DIFFERENCE   0           DIFFERENCE BETWEEN P1 AND P2   (PLUS P1* (MINUS P2*))
OF           0           K OF P2                        (TIMES K P2*)
                         P1 OF P2                       (P1 OF P2)*

(a) If P1 is a phrase, P1* indicates its interpretation in the model.

(b) PLUSS and MINUSS are identical to PLUS and MINUS except for
    precedence level.

(c) When two possible contexts are indicated, they are checked in
    the order shown.

(d) SQUARE P1 and SUM P1 are idiomatic shortenings of SQUARE OF P1
    and SUM OF P1.

(e) * outside a parenthesized expression indicates that the enclosed
    phrase is to be transformed.

(f) K is a number.

(g) / and - imply that the indicated arithmetic operations are
    actually performed.

Figure 4: Operators Recognized by STUDENT
stripped away, and the transformation process is repeated on the
resulting phrase. No further operator is found, so P1* is the
variable (NUMBER OF CUSTOMERS TOM (GETS/VERB)).

To the right of "IS" in the sentence is P2:

(2 (TIMES/OP 1) THE (SQUARE/OP 1) 20 (PERCENT/OP 2) (OF/OP)
THE NUMBER (OF/OP) ADVERTISEMENTS (HE/PRO) RUNS (PERIOD/DLM))

The first operator found in P2 is PERCENT, an operator of level 2.
From the table in Figure 4, we see that this operator has the effect
of dividing the number immediately preceding it by 100. The "PERCENT"
is removed, and the transformation is repeated on the remaining phrase.
In the example, the "... 20 (PERCENT/OP 2) (OF/OP) THE ..." becomes
"... .2000 (OF/OP) THE ...".

Continuing the transformation, the operators found are, in
order, TIMES, SQUARE, OF and OF. Each is handled as indicated in
the table. The "OF" in the context "... .2000 (OF/OP) THE ..."
is treated as if it were an infix TIMES, while at the other
occurrence of "OF", the operator marking is removed. The resulting
transformed expression for P2 is:

(TIMES 2 (EXPT (TIMES .2000 (NUMBER OF ADVERTISEMENTS (HE/PRO)
RUNS)) 2))

The transformation of the second sentence of the example is
done in a similar manner, and yields the equation:

(EQUAL (NUMBER OF ADVERTISEMENTS (HE/PRO) RUNS) 45)
The third sentence is of the form "What is P1?". It starts with
a question word and is therefore treated specially. A unique variable,
a single word consisting of an X or G followed by five integers,
is created, and the equation (EQUAL Xnnnnn P1*) is stored. For this
example, the variable X00001 was created, and this third sentence
is transformed to the equation:

(EQUAL X00001 (NUMBER OF CUSTOMERS TOM (GETS/VERB)))

In addition, the created variable is placed on the list of variables
for which STUDENT is to find a value. Also, this variable is stored,
paired with P1, the untransformed right side, for use in printing out
the answer. If a value is found for this variable, STUDENT prints the
sentence (P1 is value) with the appropriate substitution for value.
Below, we show the full set of equations, and the printed solution given
by STUDENT for the example being considered. For ease in solution, the
last equations created are put first in the list of equations.



(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (NUMBER OF CUSTOMERS TOM (GETS/VERB)))

(EQUAL (NUMBER OF ADVERTISEMENTS (HE/PRO) RUNS) 45)

(EQUAL (NUMBER OF CUSTOMERS TOM (GETS/VERB)) (TIMES 2 (EXPT
(TIMES .2000 (NUMBER OF ADVERTISEMENTS (HE/PRO) RUNS)) 2)))

(THE NUMBER OF CUSTOMERS TOM GETS IS 162)
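
The printed answer can be checked by hand: 20 percent of 45 is 9, the square of 9 is 81, and twice 81 is 162. A minimal sketch of the same check in Python follows; the evaluator and the flat variable name are illustrative assumptions and are not part of STUDENT, whose solver works on the whole set of simultaneous equations.

```python
def evaluate(expr, values):
    """Evaluate a prefix expression; variable names are looked up in `values`."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return values[expr]
    op, left, right = expr
    a, b = evaluate(left, values), evaluate(right, values)
    if op == "TIMES":
        return a * b
    if op == "PLUS":
        return a + b
    if op == "EXPT":
        return a ** b
    raise ValueError("unknown operator: " + op)


ADS = "NUMBER OF ADVERTISEMENTS HE RUNS"   # hypothetical flat name for the variable
customers = evaluate(("TIMES", 2, ("EXPT", ("TIMES", 0.2, ADS), 2)), {ADS: 45})
```

Because the computation uses floating point, `customers` should be compared with 162 within a small tolerance rather than for exact equality.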



In the example just shown, the equality relation was indicated by the
copula "is". In the problem shown below, solved by STUDENT, equality
is indicated by the occurrence of a transitive verb in the proper context.
(THE PROBLEM TO BE SOLVED IS)

(TOM HAS TWICE AS MANY FISH AS MARY HAS GUPPIES. IF MARY HAS
3 GUPPIES, WHAT IS THE NUMBER OF FISH TOM HAS Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (NUMBER OF FISH TOM (HAS/VERB)))

(EQUAL (NUMBER OF GUPPIES (MARY/PERSON) (HAS/VERB)) 3)

(EQUAL (NUMBER OF FISH TOM (HAS/VERB)) (TIMES 2 (NUMBER OF
GUPPIES (MARY/PERSON) (HAS/VERB))))

(THE NUMBER OF FISH TOM HAS IS 6)



The verb in this case is "has". The simple sentence "Mary has 3
guppies" is transformed to the "equivalent" sentence "The number of
guppies Mary has is 3" and the processing of this latter sentence is
done as previously discussed.

The general format for this type of sentence, and the format
of the intermediate sentence to which it is transformed is best
expressed by the following METEOR rule:

(* ($($1/VERB) ($1/NUMBER) $) (THE NUMBER OF 4 1 2 IS 3) *) 

This rule may be read: anything (a subject) followed by a verb
followed by a number followed by anything (the unit) is transformed to
a sentence starting with "THE NUMBER OF" followed by the unit,
followed by the subject and the verb, followed by "IS" and then the
number. In "Mary has 3 guppies" the subject is "Mary", the verb "has",
and the units "guppies". Similarly, the sentence "The witches of
Firth brew 3 magic potions" would be transformed to

"The number of magic potions the witches of Firth brew is 3."

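The effect of this rule can be sketched outside METEOR. The following Python function is only an illustration under stated assumptions: sentences are lists of upper-case words, verbs are recognized from a supplied set rather than from dictionary tags, and, unlike the METEOR "$", the subject and unit are required to be non-empty.

```python
def declare_number(tokens, verbs):
    """Rewrite SUBJECT VERB NUMBER UNIT as THE NUMBER OF UNIT SUBJECT VERB IS NUMBER."""
    for i in range(1, len(tokens) - 2):
        verb, number = tokens[i], tokens[i + 1]
        if verb in verbs and number.isdigit():
            subject, unit = tokens[:i], tokens[i + 2:]
            return ["THE", "NUMBER", "OF"] + unit + subject + [verb, "IS", number]
    return None   # the sentence does not have this format


rewritten = declare_number("MARY HAS 3 GUPPIES".split(), {"HAS", "BREW"})
```

Here `rewritten` is the word list for "THE NUMBER OF GUPPIES MARY HAS IS 3", which can then be processed as a sentence of the form "P1 is P2".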
In addition to a declaration of number, a single-object
transitive verb may be used in a comparative structure, such as
exhibited in the sentence "Tom has twice as many fish as Mary has
guppies." The METEOR rule which gives the effective transformation
for this type of sentence structure is:

(* ($ ($1/VERB) $ AS MANY $ AS $ ($1/VERB) $)

(THE NUMBER OF 6 1 2 IS 3 THE NUMBER OF 10 8 9) *)

For the example, the transformed sentence is:

"The number of fish Tom has is twice the number of guppies
Mary has."

Transformations of new sentence formats to formats previously
"understood" by the program can be easily added to the program, thus
extending the subset of English "understood" by STUDENT. In the
processing that actually takes place within STUDENT the intermediate
sentences shown never exist. It was easier to go directly to the
model from the format, utilizing subroutines previously defined in
terms of the semantics of the model.

The word "is" indicates equality only if it is not used as
an auxiliary. The example below shows how verbal phrases containing
"is", such as "is multiplied by", and "is increased by" are handled
in the transformation.
(THE PROBLEM TO BE SOLVED IS)

(A NUMBER IS MULTIPLIED BY 6 . THIS PRODUCT IS INCREASED BY 44 .
THIS RESULT IS 68 . FIND THE NUMBER .)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (NUMBER))

(EQUAL (PLUS (TIMES (NUMBER) 6) 44) 68)

(THE NUMBER IS 4)



The sentence "A number is multiplied by 6" only indicates that
two objects in the model are related multiplicatively, and does not
indicate explicitly any equality relation. The interpretation of
this sentence in the model is the prefix notation product:

(TIMES (NUMBER) 6)

This latter phrase is stored in a temporary location for possible
later reference. In this problem, it is referenced in the next
sentence, with the phrase "THIS PRODUCT". The important word in this
last phrase is "THIS"; STUDENT ignores all other words in a variable
containing the key word "THIS". The last temporarily stored phrase is
substituted for the phrase containing "THIS". Thus, the first three
sentences in the problem shown above yield only one equation, after
two substitutions for "this" phrases. The last sentence "Find the
number." is transformed as if it were "What is the number?",
and yields the first equation shown.
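
The two substitutions can be traced in a short sketch. This is a Python illustration with assumed names, not the program's own code; in particular, the wording "THIS RESULT" for the second "this" phrase is an assumption, since only the key word "THIS" matters to the substitution.

```python
def interpret(phrase, last_stored):
    """A phrase containing THIS stands for the last temporarily stored expression."""
    return last_stored if "THIS" in phrase else tuple(phrase)


# "A NUMBER IS MULTIPLIED BY 6" stores a product and yields no equation.
stored = ("TIMES", ("NUMBER",), 6)
# "THIS PRODUCT IS INCREASED BY 44" replaces the stored phrase by a sum.
stored = ("PLUS", interpret(["THIS", "PRODUCT"], stored), 44)
# A sentence of the form "P1 is P2" finally yields the single equation.
equation = ("EQUAL", interpret(["THIS", "RESULT"], stored), 68)
```

The resulting `equation` has the same nesting as the second equation printed for the problem, and its solution is 4, since 6 times 4 plus 44 is 68.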

The word "this" may occur in a context where it is not
referring to a previously stored phrase. Below is an example of
such a context.
(THE PROBLEM TO BE SOLVED IS)

(THE PRICE OF A RADIO IS 69.70 DOLLARS . IF THIS PRICE IS
15 PERCENT LESS THAN THE MARKED PRICE, FIND THE MARKED PRICE.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (MARKED PRICE))

(EQUAL (PRICE OF RADIO) (TIMES .8499 (MARKED PRICE)))

(EQUAL (PRICE OF RADIO) (TIMES 69.70 (DOLLARS)))

(THE MARKED PRICE IS 82 DOLLARS)

In such contexts, the phrase containing "THIS" is replaced by the left
half of the last equation created. In this example, STUDENT breaks
the last sentence into two simple sentences, deleting the "IF". Then
the phrase "THIS PRICE" is replaced by the variable "PRICE OF RADIO",
which is the left half of the previous equation.

This problem illustrates two other features of the STUDENT
program. The first is the action of the complex operator "percent less
than". It causes the number immediately preceding it, i.e., 15,
to be subtracted from 100, this result divided by 100, to give .85
(printed as .8499 due to a rounding error in floating point conversion).
Then this operator becomes the infix operator "TIMES". This is
indicated in the table in Figure 4.
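
The arithmetic performed for "percent less than" can be written out as a small worked example. The sketch below uses Python's exact rational arithmetic, an assumption made to avoid the rounding seen in the printed .8499; the function name is hypothetical.

```python
from fractions import Fraction


def perless_factor(k):
    """Factor for 'K PERCENT LESS THAN': (100 - K) / 100."""
    return Fraction(100 - k, 100)


factor = perless_factor(15)                 # 85/100, i.e. .85
marked_price = Fraction("69.70") / factor   # price = factor * marked price
```

With the price of 69.70 dollars, dividing by the factor .85 gives a marked price of exactly 82 dollars, in agreement with the printed answer.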

This problem also illustrates how units such as "dollars" are
handled by the STUDENT program. Any word which immediately follows a
number is labeled as a special type of variable called a unit. A
number followed by a unit is treated in the equation as a product of
the number and the unit, e.g., "69.70 DOLLARS" becomes "(TIMES
69.70 (DOLLARS))". Units are treated as special variables in solving
the set of equations; a unit may appear in the answer though other
variables cannot. If the value for a variable found by the solver is
the product of a number and a unit, STUDENT concatenates the number
and the unit. For example, the solution for "(MARKED PRICE)" in
the problem above was (TIMES 82 (DOLLARS)) and STUDENT printed out:

(THE MARKED PRICE IS 82 DOLLARS)

There is an exception to the fact that any unit may appear in
the answer, as illustrated in the problem below.

(THE PROBLEM TO BE SOLVED IS)

(IF 1 SPAN EQUALS 9 INCHES, AND 1 FATHOM EQUALS 6 FEET,
HOW MANY SPANS EQUALS 1 FATHOM Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (TIMES 1 (FATHOMS)))

(EQUAL (TIMES 1 (FATHOMS)) (TIMES 6 (FEET)))

(EQUAL (TIMES 1 (SPANS)) (TIMES 9 (INCHES)))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(USING THE FOLLOWING KNOWN RELATIONSHIPS)

((EQUAL (TIMES 1 (YARDS)) (TIMES 3 (FEET))) (EQUAL (TIMES 1
(FEET)) (TIMES 12 (INCHES))))

(1 FATHOM IS 8 SPANS)

If the unit of the answer is specified, in this problem by the phrase
"how many spans", then only that unit, in this problem "spans",
may appear in the answer. Without this restriction, STUDENT would
blithely answer this problem with "(1 FATHOM IS 1 FATHOM)".

In the transformation from the English statement of the problem
to the equations, "9 INCHES" became (TIMES 9 (INCHES)). However,
"1 FATHOM" became "(TIMES 1 (FATHOMS))". The plural form for fathom
has been used instead of the singular form. STUDENT always uses the
plural form if known, to ensure that all units appear in only one
form. Since "fathom" and "fathoms" are different, if both were used
STUDENT would treat them as distinct, unrelated units. The plural
form is part of the global information that can be made available
to STUDENT, and the plural form of a word is substituted for any
singular form appearing after "1" in any phrase. The inverse
operation is carried out for correct printout of the solution.

Notice that the information given in the problem was insufficient
to allow solution of the set of equations to be solved. Therefore,
STUDENT looked in its glossary for information concerning each of the
units in this set of equations. It found the relationships "1 foot
equals 12 inches." and "1 yard equals 3 feet." Using only the first
fact, and the equation it implies, STUDENT is then able to solve the
problem. Thus, in certain cases where a problem is not analytic,
in the sense that it does not contain, explicitly stated, all the
information needed for its solution, STUDENT is able to draw on a
body of facts, picking out relevant ones, and use them to obtain a
solution.
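
Why the single glossary fact about feet and inches suffices can be seen from a hand check of the arithmetic. The Python sketch below is an illustration only: STUDENT solves the set of simultaneous equations, whereas this sketch simply chains the conversions, and the table and function names are assumptions.

```python
from fractions import Fraction

# Each entry says: 1 <unit> equals <count> <other unit>.
facts = {
    "FATHOMS": (6, "FEET"),     # stated in the problem
    "SPANS": (9, "INCHES"),     # stated in the problem
    "FEET": (12, "INCHES"),     # retrieved from the glossary
}


def in_inches(count, unit):
    """Reduce a quantity to inches by following the facts above."""
    while unit != "INCHES":
        factor, unit = facts[unit]
        count *= factor
    return count


spans_per_fathom = Fraction(in_inches(1, "FATHOMS"), in_inches(1, "SPANS"))
```

One fathom is 72 inches and one span is 9 inches, so `spans_per_fathom` is 8, as in the printed answer. The fact about yards is never consulted.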

In certain problems, the transformation process does not yield
a set of solvable equations. However, within this set of equations
there exists a pair of variables (or more than one pair) such that
the two variables are only "slightly different", and really name the
same object in the model. When a set of equations is unsolvable,
STUDENT searches for relevant global equations. In addition, it
uses several heuristic techniques for identifying two "slightly
different" variables in the equations. The problem below illustrates
the identification of two variables where in one variable a pronoun
has been substituted for a noun phrase in the other variable. This
identification is made by checking all variables appearing before the
one containing the pronoun, and finding one which is identical to this
pronoun phrase, with a substitution of a string of any length for
the pronoun.
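
The matching test for a pronoun variable can be sketched as follows. This Python fragment is a minimal illustration, assuming variables are lists of untagged words and that pronouns are recognized from a fixed set; both are simplifications of the tagged forms STUDENT actually uses, and the sample variables are illustrative.

```python
PRONOUNS = frozenset({"THEY", "HE", "SHE", "IT"})   # assumed set, for illustration


def matches_with_pronoun(pronoun_var, candidate):
    """True if candidate equals pronoun_var with a non-empty string of words
    standing in place of the pronoun."""
    for i, word in enumerate(pronoun_var):
        if word in PRONOUNS:
            prefix, suffix = pronoun_var[:i], pronoun_var[i + 1:]
            filler = len(candidate) - len(prefix) - len(suffix)
            if (filler >= 1
                    and candidate[:len(prefix)] == prefix
                    and candidate[len(candidate) - len(suffix):] == suffix):
                return True
    return False


they = "NUMBER OF SOLDIERS THEY HAVE".split()
russians = "NUMBER OF SOLDIERS RUSSIANS HAVE".split()
same = matches_with_pronoun(they, russians)
```

Because the filler may be of any length, a pronoun can stand for a multi-word noun phrase as well as for a single word.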



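(THE PROBLEM TO BE SOLVED IS)

(THE NUMBER OF SOLDIERS THE RUSSIANS HAVE IS ONE HALF OF THE
NUMBER OF GUNS THEY HAVE . THE NUMBER OF GUNS THEY HAVE IS 7000 .
WHAT IS THE NUMBER OF SOLDIERS THEY HAVE Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (NUMBER OF SOLDIERS (THEY/PRO) (HAVE/VERB)))

(EQUAL (NUMBER OF GUNS (THEY/PRO) (HAVE/VERB)) 7000)

(EQUAL (NUMBER OF SOLDIERS RUSSIANS (HAVE/VERB)) (TIMES .5000
(NUMBER OF GUNS (THEY/PRO) (HAVE/VERB))))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(ASSUMING THAT)

((NUMBER OF SOLDIERS (THEY/PRO) (HAVE/VERB)) IS EQUAL TO
(NUMBER OF SOLDIERS RUSSIANS (HAVE/VERB)))

(THE NUMBER OF SOLDIERS THEY HAVE IS 3500)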



If two variables match in this fashion, STUDENT assumes the two
variables are equal, prints out a statement of this assumption, as
shown, and adds an equation expressing this equality to the set
to be solved. The solution procedure is called again, with this
additional equation. In the example, the additional equation was
sufficient to allow determination of the solution.
The example below is again a "non-analytic" problem. The first
set of equations developed by STUDENT is unsolvable. Therefore,
STUDENT tries to find some relevant equations in its store of
global information.



(THE PROBLEM TO BE SOLVED IS)

(THE GAS CONSUMPTION OF MY CAR IS 15 MILES PER GALLON .
THE DISTANCE BETWEEN BOSTON AND NEW YORK IS 250 MILES .
WHAT IS THE NUMBER OF GALLONS OF GAS USED ON A TRIP
BETWEEN NEW YORK AND BOSTON Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (NUMBER OF GALLONS OF GAS USED ON TRIP
BETWEEN NEW YORK AND BOSTON))

(EQUAL (DISTANCE BETWEEN BOSTON AND NEW YORK) (TIMES
250 (MILES)))

(EQUAL (GAS CONSUMPTION OF MY CAR) (QUOTIENT (TIMES
15 (MILES)) (TIMES 1 (GALLONS))))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(USING THE FOLLOWING KNOWN RELATIONSHIPS)

((EQUAL (DISTANCE) (TIMES (SPEED) (TIME))) (EQUAL (DISTANCE)
(TIMES (GAS CONSUMPTION) (NUMBER OF GALLONS OF GAS USED))))

(ASSUMING THAT)

((DISTANCE) IS EQUAL TO (DISTANCE BETWEEN BOSTON AND NEW
YORK))

(ASSUMING THAT)

((GAS CONSUMPTION) IS EQUAL TO (GAS CONSUMPTION OF MY CAR))

(ASSUMING THAT)

((NUMBER OF GALLONS OF GAS USED) IS EQUAL TO (NUMBER OF
GALLONS OF GAS USED ON TRIP BETWEEN NEW YORK AND BOSTON))

(THE NUMBER OF GALLONS OF GAS USED ON A TRIP BETWEEN
NEW YORK AND BOSTON IS 16.66 GALLONS)



STUDENT uses the first word of each variable string as a key to its
glossary. The one exception to this rule is that the words "number
of" are ignored if they are the first two words of a variable string.
Thus, in this problem, STUDENT retrieved equations which were stored
under the key words distance, gallons, gas, and miles. Two facts
about distance had been stored earlier: "distance equals speed times
time" and "distance equals gas consumption times number of gallons
of gas used". The equations implicit in these sentences were stored
and retrieved now as possibly useful for the solution of this
problem. In fact, only the second is relevant.
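
The keying rule can be sketched in a few lines. The Python below is an illustration under assumptions: variables are lists of words, the glossary is a dictionary from key word to stored facts, and the facts are kept as English text rather than as equations.

```python
def glossary_key(variable):
    """Return the word under which facts about this variable would be filed."""
    words = list(variable)
    if len(words) > 2 and words[:2] == ["NUMBER", "OF"]:
        words = words[2:]          # the one exception: skip a leading NUMBER OF
    return words[0]


# A toy glossary holding the two stored facts about distance.
glossary = {
    "DISTANCE": [
        "DISTANCE EQUALS SPEED TIMES TIME",
        "DISTANCE EQUALS GAS CONSUMPTION TIMES NUMBER OF GALLONS OF GAS USED",
    ],
}


def retrieve(variables):
    """Collect every stored fact filed under the key of some variable, without repeats."""
    found = []
    for var in variables:
        for fact in glossary.get(glossary_key(var), []):
            if fact not in found:
                found.append(fact)
    return found
```

Retrieval is deliberately generous: both facts filed under DISTANCE are returned, and it is left to the solver to ignore the one that does not help.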

Before any attempt is made to solve this augmented set of
equations, the variables in the augmented set are matched, to
identify "slightly different" variables which refer to the same object
in the model. In this example "(DISTANCE)", "(GAS CONSUMPTION)" and
"(NUMBER OF GALLONS OF GAS USED)" are all identified with "similar"
variables. The following conditions must be satisfied for this type
of identification of variables P1 and P2:

1) P1 must appear later in the problem than P2.

2) P1 is completely contained in P2 in the sense that P1
is a contiguous substring within P2.
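
The two conditions can be stated compactly in code. The following Python sketch assumes variables are tuples of words and that a table giving each variable's order of appearance is available; both the table and the function names are hypothetical.

```python
def is_contiguous_sublist(short, long):
    """True if the words of `short` occur consecutively, in order, within `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def can_identify(p1, p2, position):
    """Condition 1: p1 appears later than p2. Condition 2: p1 lies within p2."""
    return position[p1] > position[p2] and is_contiguous_sublist(list(p1), list(p2))


full = ("DISTANCE", "BETWEEN", "BOSTON", "AND", "NEW", "YORK")
short = ("DISTANCE",)
order = {full: 0, short: 1}        # the longer variable was seen first
identified = can_identify(short, full, order)
```

The test is asymmetric: a longer variable is never identified with a shorter one that preceded it.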

This identification reflects a syntactic phenomenon where a
truncated phrase, with one or more modifying phrases dropped, is
often used in place of the original phrase. For example, if the phrase
"the length of a rectangle" has occurred, the phrase "the length"
may be used to mean the same thing. This type of identification is
distinct from that made using pronoun substitution.

In the example above, a stored schema was used by identifying
the variables in the schema with the variables that occur in the
problem. This problem is solvable because the key phrases "distance",
"gas consumption" and "number of gallons of gas used" occur as
substrings of the variables in the problem. Since STUDENT identifies
each generic key phrase of the schema with a particular variable
of the problem, any schema can be used only once in a problem.
Because STUDENT handles schemas in this ad hoc fashion, it cannot
solve problems in which a relationship such as "distance equals
speed times time" is needed for two different values of distance,
speed, and time.

E. Possible Idiomatic Substitutions. 

There are some phrases which have a dual character, depending
on the context. In the example below, the phrase "perimeter of a
rectangle" becomes a variable with no reference to its meaning, or
definition, in terms of the length and width of the rectangle.
This definition is not needed for solution.



(THE PROBLEM TO BE SOLVED IS)

(THE SUM OF THE PERIMETER OF A RECTANGLE AND THE PERIMETER
OF A TRIANGLE IS 24 INCHES . IF THE PERIMETER OF THE RECTANGLE
IS TWICE THE PERIMETER OF THE TRIANGLE , WHAT IS THE
PERIMETER OF THE TRIANGLE Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 (PERIMETER OF TRIANGLE))

(EQUAL (PERIMETER OF RECTANGLE) (TIMES 2 (PERIMETER OF
TRIANGLE)))

(EQUAL (PLUS (PERIMETER OF RECTANGLE) (PERIMETER OF TRIANGLE))
(TIMES 24 (INCHES)))

(THE PERIMETER OF THE TRIANGLE IS 8 INCHES)



However, the following problem is stated in terms of the
perimeter, length and width of the rectangle. Transforming the English into
(THE PROBLEM TO BE SOLVED IS)

(THE LENGTH OF A RECTANGLE IS 8 INCHES MORE THAN THE WIDTH
OF THE RECTANGLE . ONE HALF OF THE PERIMETER OF THE RECTANGLE
IS 18 INCHES . FIND THE LENGTH AND THE WIDTH OF THE RECTANGLE .)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL G02516 (WIDTH OF RECTANGLE))

(EQUAL G02515 (LENGTH))

(EQUAL (TIMES .5000 (PERIMETER OF RECTANGLE)) (TIMES 18 (INCHES)))

(EQUAL (LENGTH OF RECTANGLE) (PLUS (TIMES 8 (INCHES)) (WIDTH
OF RECTANGLE)))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(USING THE FOLLOWING KNOWN RELATIONSHIPS)
((EQUAL (TIMES 1 (FEET)) (TIMES 12 (INCHES))))

(ASSUMING THAT)

((LENGTH) IS EQUAL TO (LENGTH OF RECTANGLE))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

TRYING POSSIBLE IDIOMS

(THE PROBLEM WITH AN IDIOMATIC SUBSTITUTION IS)
(THE LENGTH OF A RECTANGLE IS 8 INCHES MORE THAN THE WIDTH
OF THE RECTANGLE . ONE HALF OF TWICE THE SUM OF THE LENGTH
AND WIDTH OF THE RECTANGLE IS 18 INCHES . FIND THE LENGTH AND
THE WIDTH OF THE RECTANGLE .)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL G02518 (WIDTH OF RECTANGLE))

(EQUAL G02517 (LENGTH))

(EQUAL (TIMES (TIMES .5000 2) (PLUS (LENGTH) (WIDTH OF RECTANGLE)))
(TIMES 18 (INCHES)))

(EQUAL (LENGTH OF RECTANGLE) (PLUS (TIMES 8 (INCHES)) (WIDTH
OF RECTANGLE)))

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(USING THE FOLLOWING KNOWN RELATIONSHIPS)
((EQUAL (TIMES 1 (FEET)) (TIMES 12 (INCHES))))

(ASSUMING THAT)

((LENGTH) IS EQUAL TO (LENGTH OF RECTANGLE))

(THE LENGTH IS 13 INCHES)

(THE WIDTH OF THE RECTANGLE IS 5 INCHES)

equations is not sufficient for solution. Neither retrieving and
using an equation about "inches", the unit in the problem, nor
identifying "length" with a longer phrase serve to make the problem
solvable. Therefore, STUDENT looks in its dictionary of possible idioms,
and finds one which it can try in the problem. STUDENT actually
had two possible idiomatic substitutions which it could have made
for "perimeter of a rectangle"; one was in terms of the length and
width of the rectangle and the other was in terms of the shortest and
longest sides of the rectangle. When there are two possible
substitutions for a given phrase, one is tried first, namely, the one
STUDENT has been told about most recently. In this problem, the
correct one was fortunately first. If the other had been first, the
revised problem would not have been any more solvable than the
original, and eventually the second (correct) substitution would have
been made. Only one non-mandatory idiomatic substitution is ever
made at one time, although the substitution is made for all
occurrences of the phrase chosen.

In this problem, the idiomatic substitution made allows the 
problem to be solved, after identification of the variables "length" 
and "length of rectangle". The retrieved equation about inches was 
not needed. However, its presence in the set of equations to be 
solved did not sidetrack the solver in any way. 

This use of possible, but non-mandatory idiomatic substitutions
can also be used to give STUDENT a way to solve problems in which two
phrases denoting one particular variable are quite different. For
example, the phrase, "students who passed the admissions test" and
"successful candidates" might be describing the same set of people.
However, since STUDENT knows nothing of the "real world" and its
value system for success, it would never identify these two phrases.
However, if told that "successful candidates" sometimes means "students
who passed the admissions test", it would be able to solve a problem
using these two phrases to identify the same variable. Thus,
possible idiomatic substitutions serve the dual purpose of providing
tentative substitutions of definitions, and identification of
synonymous phrases.



F. Special Heuristics.

The methods thus far discussed have been applicable to the
entire range of algebra problems. However, for special classes of
problems, additional heuristics may be used which are needed for
members of the class, but not applicable to other problems. An
example is the class of age problems, as typified by the problem
below.

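(THE PROBLEM TO BE SOLVED IS)

(BILL S FATHER S UNCLE IS TWICE AS OLD AS BILL S FATHER . 2
YEARS FROM NOW BILL S FATHER WILL BE 3 TIMES AS OLD AS BILL .
THE SUM OF THEIR AGES IS 92 . FIND BILL S AGE .)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00001 ((BILL / PERSON) S AGE))

(EQUAL (PLUS ((BILL / PERSON) S (FATHER / PERSON) S (UNCLE
/ PERSON) S AGE) (PLUS ((BILL / PERSON) S (FATHER / PERSON)
S AGE) ((BILL / PERSON) S AGE))) 92)

(EQUAL (PLUS ((BILL / PERSON) S (FATHER / PERSON) S AGE) 2)
(TIMES 3 (PLUS ((BILL / PERSON) S AGE) 2)))

(EQUAL ((BILL / PERSON) S (FATHER / PERSON) S (UNCLE / PERSON)
S AGE) (TIMES 2 ((BILL / PERSON) S (FATHER / PERSON) S AGE)))

(BILL S AGE IS 8)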

Before the age problem heuristics are used, a problem must be
identified as belonging to that class of problems. STUDENT identifies
age problems by any occurrence of one of the following phrases, "as old
as", "years old" and "age". This identification is made immediately
after all words are looked up in the dictionary and tagged by function.






After the special heuristics are used the modified problem is
transformed to equations as described previously.

The need for special methods for age problems arises because
of the conventions used for denoting the variables, all of which are
ages. The word age is usually not used explicitly, but is implicit
in such phrases as "as old as". People's names are used where their
ages are really the implicit variables. In the example, for instance,
the phrase "Bill's father's uncle" is used instead of the phrase
"Bill's father's uncle's age".

STUDENT uses a special heuristic to make all these ages
explicit. To do this, it must know which words are "person words" and
therefore, may be associated with an age. For this problem STUDENT
has been told that Bill, father, and uncle are person words. They
can be seen tagged as such in the equations. The "S" following a
word is the STUDENT representation for possessive, used instead of
"apostrophe-s" for programming convenience. STUDENT inserts a
"S AGE" after every person word not followed by a "S" (because this
"S" indicates that the person word is being used in a possessive
sense, not as an independent age variable). Thus, as indicated,
the phrase "BILL S FATHER S UNCLE" becomes "BILL S FATHER S UNCLE S
AGE".
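
The insertion of "S AGE" can be sketched directly from this description. The Python below is a minimal illustration assuming untagged word lists and a supplied set of person words; STUDENT itself works on words tagged in its dictionary.

```python
def make_ages_explicit(tokens, person_words):
    """Insert S AGE after each person word that is not already followed by S."""
    out = []
    for i, word in enumerate(tokens):
        out.append(word)
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if word in person_words and following != "S":
            out += ["S", "AGE"]
    return out


people = {"BILL", "FATHER", "UNCLE"}
explicit = make_ages_explicit("BILL S FATHER S UNCLE".split(), people)
```

Only the last person word of a possessive chain receives the "S AGE", and a phrase that already ends in "S AGE", such as "BILL S AGE", is left unchanged.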

In addition to changing phrases naming people to ones naming
ages, STUDENT makes certain special idiomatic substitutions. For
the phrase "their ages", STUDENT substitutes a conjunction of all
the age variables encountered in the problem. In the example, for
"THEIR AGES" STUDENT substitutes "BILL S FATHER S UNCLE S AGE AND
BILL S FATHER S AGE AND BILL S AGE". The phrases "as old as" and
"years old" are then deleted as dummy phrases not having any meaning,
and "will be" and "was" are changed to "is". There is no need to
preserve the tense of the copula, since the sense of the future or
past tense is preserved in such prefix phrases as "2 years from now",
or "3 years ago".

The remaining special age problem heuristics are used to process
the phrases "in 2 years", "5 years ago" and "now". The phrase "2
years from now" is transformed to "in 2 years" before processing.
These three time phrases may occur immediately after the word "age"
(e.g., "Bill's age 3 years ago") or at the beginning of the sentence.
If a time phrase occurs at the beginning of the sentence, it
implicitly modifies all ages mentioned in the sentence, except those
followed by their own time phrase. For example, "In 2 years Bill's
father's age will be 3 times Bill's age" is equivalent to "Bill's
father's age in 2 years will be 3 times Bill's age in 2 years".
However, "3 years ago Mary's age was 2 times Ann's age now" is
equivalent to "Mary's age 3 years ago was 2 times Ann's age now".
These prefix time phrases are handled by distributing them over all
ages not modified by another time phrase.

After these prefix phrases have been distributed, each time
phrase is translated appropriately. The phrase "in 5 years" causes
5 to be added to the age it follows, and "5 years ago" causes 5
to be subtracted from the age preceding this phrase. The word "now"
is deleted.
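
The translation of a time phrase attached to an age can be sketched as follows. The Python function is an illustration with assumed representations: an age is any model expression, a time phrase is a short token list, and subtraction is written as addition of a MINUS term, as in the equations STUDENT prints.

```python
def translate_time_phrase(age, phrase):
    """Combine an age expression with the time phrase that follows it."""
    if phrase == ["NOW"]:
        return age                                       # "now" is simply deleted
    if len(phrase) == 3 and phrase[0] == "IN" and phrase[2] == "YEARS":
        return ("PLUS", age, phrase[1])                  # "in K years" adds K
    if len(phrase) == 3 and phrase[1:] == ["YEARS", "AGO"]:
        return ("PLUS", age, ("MINUS", phrase[0]))       # "K years ago" subtracts K
    raise ValueError("not a recognized time phrase")


bill = ("BILL", "S", "AGE")
later = translate_time_phrase(bill, ["IN", 2, "YEARS"])
earlier = translate_time_phrase(bill, [3, "YEARS", "AGO"])
```

The number in the phrase need not be a literal; a created variable may stand in its place.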

Only the special heuristics described thus far are necessary to
solve the first age problem. The second age problem, given below,
requires one additional heuristic not previously mentioned. This
is a substitution for the phrase "was when" which effectively
decouples the two facts combined in the first sentence. For "was
when", STUDENT substitutes "K years ago . K years ago" where
K is a new variable created for this purpose.
(THE PROBLEM TO BE SOLVED IS)

(MARY IS TWICE AS OLD AS ANN WAS WHEN MARY WAS AS OLD AS ANN
IS NOW . IF MARY IS 24 YEARS OLD, HOW OLD IS ANN Q.)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL X00008 ((ANN / PERSON) S AGE))

(EQUAL ((MARY / PERSON) S AGE) 24)

(EQUAL (PLUS ((MARY / PERSON) S AGE) (MINUS (X00007))) ((ANN
/ PERSON) S AGE))

(EQUAL ((MARY / PERSON) S AGE) (TIMES 2 (PLUS ((ANN / PERSON)
S AGE) (MINUS (X00007)))))

(ANN S AGE IS 18)


In the example, the first sentence becomes the two sentences
"Mary is twice as old as Ann X00007 years ago . X00007 years ago
Mary was as old as Ann is now." These two occurrences of time
phrases are handled as discussed previously. Similarly the phrase
"will be when" would be transformed to "in K years . In K years".

These decoupling heuristics are useful not only for the STUDENT
program but for people trying to solve age problems. The classic age
problem about Mary and Ann, given above, took an MIT graduate student
over 5 minutes to solve because he did not know this heuristic. With
the heuristic he was able to set up the appropriate equations much
more rapidly. As a crude measure of STUDENT's relative speed, note
that STUDENT took less than one minute to solve this problem.
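As a check on the decoupled equations for the Mary and Ann problem, the substitution can be carried out by hand; the sketch below does the same arithmetic in Python. The function name and the use of `Fraction` are choices made for this illustration only, and K stands for the new variable introduced by the "was when" substitution.

```python
from fractions import Fraction

def solve_was_when(mary_age):
    """Solve  M - K = A  and  M = 2 * (A - K)  for A and K, given M.

    From the first equation K = M - A; substituting into the second
    gives M = 2 * (2A - M), so A = 3M / 4.
    """
    m = Fraction(mary_age)
    ann = 3 * m / 4
    k = m - ann
    # Verify both decoupled equations before returning.
    assert m - k == ann
    assert m == 2 * (ann - k)
    return ann, k

ann_age, years_ago = solve_was_when(24)
print(ann_age, years_ago)   # 18 6
```

With Mary's age equal to 24 this gives Ann's age as 18, in agreement with the answer STUDENT prints, and K = 6.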



G. When All Else Fails.

For all the problems discussed thus far, STUDENT was able to
find a solution eventually. In some cases, however, necessary global
information is missing from its store of information, or variables
which name the same object cannot be identified by the heuristics
of the program. Whenever STUDENT cannot find a solution for any
reason, it turns to the questioner for help. As in the problem
below, it prints out "(DO YOU KNOW ANY MORE RELATIONSHIPS AMONG
THESE VARIABLES)" followed by a list of the variables in the problem.
The questioner can answer "yes" or "no". If he says "yes",
STUDENT says "TELL ME", and the questioner can append another
sentence to the statement of the problem.



(THE PROBLEM TO BE SOLVED IS)

(THE GROSS WEIGHT OF A SHIP IS 20000 TONS . IF ITS NET
WEIGHT IS 15000 TONS , WHAT IS THE WEIGHT OF THE SHIPS
CARGO Q.)

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

TRYING POSSIBLE IDIOMS

(DO YOU KNOW ANY MORE RELATIONSHIPS AMONG THESE VARIABLES)

(GROSS WEIGHT OF SHIP)

(TONS)

(ITS NET WEIGHT)

(WEIGHT OF SHIPS CARGO)



yes
TELL ME

(the weight of a ships cargo is the difference between
the gross weight and the net weight)

THE EQUATIONS WERE INSUFFICIENT TO FIND A SOLUTION

(ASSUMING THAT)

((NET WEIGHT) IS EQUAL TO (ITS NET WEIGHT))

(ASSUMING THAT)

((GROSS WEIGHT) IS EQUAL TO (GROSS WEIGHT OF SHIP))



(THE WEIGHT OF THE SHIPS CARGO IS 5000 TONS)



In this problem, the additional information typed in (in lower
case letters) was sufficient to solve the problem. If it were not,
the question would be repeated until the questioner said "no", or
provided sufficient information for solution of the problem.

In the problem below, the solution to the set of equations
involves solving a quadratic equation, which is beyond the
mathematical ability of the present STUDENT system. Note that in
this case STUDENT reports that the equations were unsolvable, not
simply insufficient for solution. STUDENT still requests additional
information from the questioner. In the example, the questioner says
"no", and STUDENT states that "I CANT SOLVE THIS PROBLEM" and terminates.

(THE PROBLEM TO BE SOLVED IS)

(THE SQUARE OF THE DIFFERENCE BETWEEN THE NUMBER OF
APPLES AND THE NUMBER OF ORANGES ON THE TABLE IS EQUAL
TO 9 . IF THE NUMBER OF APPLES IS 7 , FIND THE NUMBER
OF ORANGES ON THE TABLE .)

(THE EQUATIONS TO BE SOLVED ARE)

(EQUAL G02515 (NUMBER OF ORANGES ON TABLE))

(EQUAL (NUMBER OF APPLES) 7)

(EQUAL (EXPT (PLUS (NUMBER OF APPLES) (MINUS (NUMBER
OF ORANGES ON TABLE))) 2) 9)

UNABLE TO SOLVE THIS SET OF EQUATIONS
TRYING POSSIBLE IDIOMS

(DO YOU KNOW ANY MORE RELATIONSHIPS AMONG THESE VARIABLES)

(NUMBER OF APPLES)

(NUMBER OF ORANGES ON TABLE)

no

I CANT SOLVE THIS PROBLEM



H. Summary of the STUDENT Subset of English.

The subset of English understandable by STUDENT is built
around a core of sentence and phrase formats, which can be transformed
into expressions in the STUDENT deductive model. On this basic
core is built a larger set of formats. Each of these is first
transformed into a string built on formats in this basic set and then this
string is transformed into an expression in the deductive model. For
example, the format ($ IS EQUAL TO $) is changed to the basic
format ($ IS $), and the phrase "IS CONSECUTIVE TO" is changed to
"IS 1 PLUS". The constructions discussed earlier involving single
object transitive verbs could have been handled this way, though
for programming convenience they were not.

The complete list of the basic formats accepted by the present
STUDENT system can be determined by examining (in the program
listing in the Appendix) the rules from the one labeled OPFORM to the one
labeled QSET. The METEOR rules of the STUDENT program precisely
specify the acceptable formats, and their translations to the model,
but I shall try to summarize the basic and extended formats here.
Implicitly assumed in the syntax is that any operator appears only
within one of the contexts specified in the table in Chapter II,
and only the operators given in the table appear. The listing of
STUDENT starting at the rule labeled IDIOMS gives translations of
additional operators to those in the table.

The basic linguistic form which is transformed into an 
equation is one containing "is" as a copula. The phrases "is equal 
to" and "equals" are both changed to the copula "is". The 
auxiliary verbal constructions "is multiplied by", "is divided by" 
and "is increased by" are also acceptable as principal verbs in a 
sentence. As discussed in detail earlier, a sentence with no 
occurrence of "is" can have as a main verb a transitive verb
immediately followed by a number. This number must be part of the noun
phrase which is the direct object of the verb, as in "Mary has
three guppies". This type of transitive verb can also have a
comparative structure as direct object, e.g. "Mary has twice as many
guppies as Tom has fish".

This completes the repertoire of declarative sentence formats.
Any number of declarative sentences may be conjoined, with ", and"
between each pair, to form a new (complex) declarative sentence.
A declarative sentence (even a complex declarative) can be made
a presupposition for a question by preceding it with "IF" and
following it with a comma and the question.

Questions, that is, requests for information from STUDENT, will
be understood if they match any of the patterns:

(WHAT ARE $ AND $) (WHAT IS $)

(FIND $ AND $) (FIND $)

(HOW MANY $ DO $ HAVE) (HOW MANY $ DOES $ HAVE)
(HOW MANY $1 IS $)
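The way a question is matched against these patterns can be sketched as follows. This is an illustration in Python rather than METEOR; the function names are invented for the example, "$" is taken to match any non-empty string of words, and treating "$1" as "exactly one word" and trying the patterns in the order listed are assumptions of the sketch.

```python
import re

# Question formats from the list above.
QUESTION_PATTERNS = [
    "WHAT ARE $ AND $", "WHAT IS $",
    "FIND $ AND $", "FIND $",
    "HOW MANY $ DO $ HAVE", "HOW MANY $ DOES $ HAVE",
    "HOW MANY $1 IS $",
]

def _to_regex(pattern):
    """Turn a format such as 'FIND $ AND $' into a compiled regex."""
    parts = []
    for token in pattern.split():
        if token == "$":
            parts.append(r"(.+?)")        # any non-empty string of words
        elif token == "$1":
            parts.append(r"(\S+)")        # assumed: exactly one word
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts), re.IGNORECASE)

def match_question(text):
    """Return (pattern, matched pieces) for the first matching format, else None."""
    normalized = " ".join(text.split())
    for pattern in QUESTION_PATTERNS:
        m = _to_regex(pattern).fullmatch(normalized)
        if m:
            return pattern, list(m.groups())
    return None
```

For example, "FIND THE NUMBER OF ORANGES ON THE TABLE" matches (FIND $) with the single piece "THE NUMBER OF ORANGES ON THE TABLE", which then becomes the requested unknown.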



This completes the summary of the set of input formats present- 
ly understood by STUDENT. This set can be enlarged in two distinct 
ways. One is to enlarge the set of basic formats, using standard 
subroutines to aid in defining, for each new basic format, its inter- 
pretation in the deductive model. The other method of extending the 
range of STUDENT input is to define transformations from new input
formats to previously understood basic or extension formats. In the
next chapter we discuss how this latter type of extension can be
performed at run time, using the STUDENT global information storage
facility. A combination of English and METEOR elementary pattern
elements can be used to define the input format and transformation.

Even if a story problem is stated within the subset of English
acceptable to STUDENT, this is not a guarantee that this problem can
be solved by STUDENT (assuming it to be solvable). Two phrases
describing the same object must be at worst only "slightly different" by
the criteria prescribed earlier. Appropriate global information
must be available to STUDENT, and the algebra involved must not
exceed the abilities of the solver. However, though most algebra story
problems found in the standard texts cannot be solved by STUDENT exactly
as written, the author has usually been able to find some paraphrase
of almost all such problems, which is solvable by STUDENT. Appendix D
contains a fair sample of the range of problems that can be handled
by the STUDENT system.



I. Limitations of the STUDENT Subset of English.

The techniques presented in this chapter are general and can
be used to enable a computer program to accept and understand a
fairly extensive subset of English for a fixed semantic base.
However, the current STUDENT system is experimental and has a number of
limitations.

STUDENT's interpretation of the input is based on format
matching. If each format is used to express the meaning understood
by STUDENT, no misinterpretation will occur. However, these formats
occur in English discourse, even in algebra story problems, in semantic
contexts not consistent with STUDENT's interpretation of these
formats. For example, a sentence matching the format "($ , AND $)"
is always interpreted by STUDENT as the conjunction of two declarative
statements. Therefore, the sentence "Tom has 2 apples, 3 bananas, and
4 pears." would be incorrectly divided into the two "sentences"
"Tom has 2 apples, 3 bananas." and "4 pears."

Each of the operator words shown in Figure 4 must be used as
an operator in the context as shown or a misinterpretation will
result. For example, the phrase "the number of times I went to
the movies" which should be interpreted as a variable string will be
interpreted incorrectly as the product of the two variables "number of"
and "I went to the movies", because "times" is always considered to
be an operator. Similarly, in the current implementation of STUDENT,
"of" is considered to be an operator if it is preceded by any number.
However, the phrase "2 of the boys who passed" will be misinterpreted
as the product of "2" and "the boys who passed".

These examples obviously do not constitute a complete list of
misinterpretations and errors STUDENT will make, but they should give
the reader an idea of limitations on the STUDENT subset of English.
In principle, all of these restrictions could be removed. However,
removing some of them would require only minor changes to the program,
while others would require techniques not used in the current
system.
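The misdivision of a sentence matching the format "($ , AND $)", described earlier in this section, is easy to reproduce. The sketch below is a Python illustration of a purely format-based split; the function name and the regular expression are assumptions of the example and are not taken from the STUDENT listing.

```python
import re

def split_on_comma_and(sentence):
    """Split a sentence at every ', and', as a purely format-based rule would."""
    body = sentence.strip().rstrip(".").strip()
    parts = re.split(r"\s*,\s+and\s+", body, flags=re.IGNORECASE)
    return [part.strip() + "." for part in parts]

print(split_on_comma_and("Tom has 2 apples, 3 bananas, and 4 pears."))
# ['Tom has 2 apples, 3 bananas.', '4 pears.']
```

The rule cannot tell a conjunction of two declarative sentences from the last item of a list, which is why a more elaborate grammar would be needed to handle such input correctly.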

For example, to correct the error in interpreting "2 of the
boys who passed", one can simply check to see if the number before the
"of" is less than 1, and if so, only then treat "of" as an
operator, "times". However, a much more sophisticated grammar and
parsing program would be necessary to distinguish different
occurrences of the format "($ , AND $)", and correctly extract simple
sentences from complex coordinate and subordinate sentences.
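The proposed check on the word before "of" can be sketched in a few lines. This is an illustration in Python only; the function name, the `strict` flag, and the numeral pattern are assumptions of the example. With `strict=False` it mimics the rule described for the current implementation (any preceding number makes "of" an operator); with `strict=True` it applies the suggested correction.

```python
import re

_NUMERAL = re.compile(r"\d+(\.\d*)?|\.\d+")

def of_is_operator(previous_token, strict=False):
    """Decide whether "of" following previous_token should be read as "times"."""
    if not _NUMERAL.fullmatch(previous_token):
        return False                    # not preceded by a number
    if not strict:
        return True                     # current rule: any number will do
    return float(previous_token) < 1    # proposed rule: only fractions
```

Under the stricter rule ".5 of the boys" is still read as a product, while "2 of the boys who passed" is left as part of a variable string.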

Because of limitations of the sort described above, and the
fact that the STUDENT system currently occupies almost all of the
computer memory, STUDENT serves principally as a demonstration of
the power of the techniques utilized in its construction. However,
I believe that on a larger computer one could use these techniques
to construct a system of practical value which would communicate
well with people in English over the limited range of material
understood by the program.



CHAPTER V: STORAGE OF GLOBAL INFORMATION



This algebra problem-solving system contains two programs
which process English input. One is the program just discussed,
STUDENT, which accepts the statement of an algebra story problem and
attempts to find the solution to the particular problem. STUDENT does
not store any information, nor "remember" anything from problem to
problem. The information obtained by STUDENT is the local context
of the question.

The other program is called REMEMBER and it processes and stores
facts not specific to any one problem. These facts make up STUDENT's
store of "global information" as opposed to "local information"
specific to the problem. This information is accepted in a subset of
English which overlaps but is different from the subset of English
accepted by STUDENT. REMEMBER accepts statements in certain fixed
formats, and for each format the information is stored in a way that
makes it convenient for retrieval and use within the STUDENT program.
Some information is stored by actually adding METEOR rules to the
STUDENT program, and other information is stored on property lists
of individual words, which are unique atoms in the LISP system.

The following are the formats currently understood by REMEMBER, 
and the processing and information storage techniques used for 
each one: 

1. Format: P1 EQUALS P2
Example: DISTANCE EQUALS SPEED TIMES TIME

Processing: The sentence is transformed into an equation in
the same way it is done in STUDENT. This equation is stored on the
property lists of the atoms which are the first words in each
variable. In the example, the equation

"(EQUAL (DISTANCE) (TIMES (SPEED) (TIME)))"

is stored on the property lists of "DISTANCE", "SPEED" and "TIME".
If any one of these words appears as the initial word of a variable
in a problem, and global equations are needed to solve this problem,
this equation will be retrieved.

2. Format: P1 IS AN OPERATOR OF LEVEL K
Example: TIMES IS AN OPERATOR OF LEVEL 1

Processing: A dictionary entry for P1 is created, with
subscripts of OP and K. For TIMES, the dictionary entry (TIMES / OP 1)
is created. The dictionary entry for any word is placed on the
property list of that word (atom), and is retrieved and used in
place of any occurrence of that word in a problem.

3. Format: P1 IS AN OPERATOR
Example: OF IS AN OPERATOR

Processing: A dictionary entry is created for P1 with the
subscript OP. The entry for OF is (OF / OP).

4. Format: P1 IS A P2
Example: BILL IS A PERSON

Processing: A dictionary entry is created for P1 with
subscript P2. The entry for BILL is (BILL / PERSON).

5. Format: P1 IS THE PLURAL OF P2
Example: FEET IS THE PLURAL OF FOOT

Processing: P2 is stored on the property list of P1, after
the flag SING; the word P1 is stored on the property list of P2
after the flag PLURAL. Thus FEET is stored after PLURAL on the
property list of the atom FOOT.

6. Format: P1 SOMETIMES MEANS P2
Example: TWO NUMBERS SOMETIMES MEANS ONE NUMBER AND THE
OTHER NUMBER

Processing: The STUDENT program is modified so that an idiomatic
substitution of P2 for P1 will be made in a problem if it is
otherwise unsolvable. All such "possible idiomatic substitutions" are
tried when necessary, with the last one entered being the first one
tried. The STUDENT program is modified by the addition of four new
METEOR rules. Since P1 and P2 are inserted as left and right halves
of a METEOR rule, they need not contain only words, but can use the
METEOR elementary patterns to specify a format change instead of
just a phrase change. For the example shown, the rules added to the
STUDENT program, as listed in Appendix B, are the rule labeled
G02510, the rule following that one, the rule labeled G02511 and the
rule following it.

7. Format: P1 ALWAYS MEANS P2
Example: ONE HALF ALWAYS MEANS 0.5

Processing: The program STUDENT is modified so that if P1
occurs, a mandatory substitution of P2 for P1 will be made in any
problem. The last sentence in this format processed by REMEMBER will
be the first mandatory substitution made. Thus "one always means 1"
followed by "one half always means 0.5" will cause the desired
substitutions to be made; if these sentences were reversed no occurrence
of "one half" would ever be found since it would have been changed
to "1 half", by mandatory substitution of 1 for one.

For each sentence in this format processed by REMEMBER, a
new METEOR rule is added to the STUDENT program, immediately
following the rule named IDIOMS. The format of the METEOR rule added
is (* (P1) (P2) IDIOMS), where P1 and P2 are taken from the sentence
processed. Thus by using a combination of English and METEOR
elementary patterns and reference numbers in P1 and P2, one can add
a new format and its translation to the STUDENT program. The
following statement was processed by REMEMBER to allow STUDENT to
"understand" (correctly transform) a sentence in which the verb
was "exceeds":

($ BKC^aJB $ BY ^ AOTAYS 1«M<^'^ ^If ^'M*»qitt(6l 3)^21 ^ 

This permanently extended the STUDENT input subset of English,
while avoiding the necessity of actually editing and changing the
STUDENT program.
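The ordering rule for mandatory substitutions ("the last sentence processed is the first substitution made") can be illustrated with a small sketch. This is Python rather than METEOR, it handles only literal phrases (not METEOR elementary patterns), and the class and method names are invented for the example.

```python
def _replace_all(words, old, new):
    """Replace every occurrence of the word sequence old by new."""
    result, i = [], 0
    while i < len(words):
        if words[i:i + len(old)] == old:
            result.extend(new)
            i += len(old)
        else:
            result.append(words[i])
            i += 1
    return result

class MandatorySubstitutions:
    """Holds 'P1 ALWAYS MEANS P2' facts; newest fact is applied first."""

    def __init__(self):
        self._rules = []

    def always_means(self, phrase, replacement):
        old = phrase.lower().split()
        if not old:
            raise ValueError("phrase must contain at least one word")
        # Insert at the front, as a rule placed right after IDIOMS would be.
        self._rules.insert(0, (old, replacement.lower().split()))

    def apply(self, text):
        words = text.lower().split()
        for old, new in self._rules:
            words = _replace_all(words, old, new)
        return " ".join(words)
```

Entering "one always means 1" and then "one half always means 0.5" turns "one half of one number" into "0.5 of 1 number"; entering them in the opposite order yields "1 half of 1 number", which is the failure the text describes.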



The global information stored for STUDENT includes both equations
and format changes. The use of the METEOR processor, together with
the property list processing operations in LISP, made the storage
and retrieval of this information easy to program. Appendix C is a
listing of the global information currently available to the
STUDENT system.

CHAPTER VI: SOLUTION OF SIMULTANEOUS EQUATIONS



This chapter contains a description of the functions
used by STUDENT to solve sets of simultaneous equations. The
definitions of these functions
are shown in the figure at the end of this chapter. This
description of these functions is essentially independent of a detailed
knowledge of LISP, though occasional comments will
be directed to the more knowledgeable.

The top level function, SOLVE, is a function of three
arguments. One, labeled EQT in the definition of SOLVE, is the set of
equations to be solved. The argument labeled WANTED in the
definition is a list of variables whose values are wanted. The third
argument, labeled TERMS, is another list of variables, disjoint
from WANTED. SOLVE will find the value of any variable which
is wanted in terms of any or all of the variables on the list TERMS.
In use, the list TERMS is a list of units, such as pounds or feet,
which may appear in the answer.

The output of SOLVE is dependent on whether the set of equations
given can be solved for the variables wanted. If no solution
can be found because the solution involves nonlinear processes, SOLVE
returns with the value UNSOLVABLE. If no solution is found because
not enough equations are given, SOLVE returns with the value
INSUFFICIENT. If, however, a solution is found, SOLVE returns with a list
of pairs. The first element of each pair is a variable, either on the
wanted list, or a variable whose value was found while solving for the
desired unknowns. The second element of each pair is an arithmetic
expression (in the prefix notation shown in Figure 2), which contains
only numbers and variables on the list TERMS. Thus, the answer found
by SOLVE is an "association list" which gives the values of variables
in the proper terms.

For example, let us consider the set of seven simultaneous
equations shown below, and suppose SOLVE is asked to solve this
set of equations for x and z. These are given in infix notation
for ease of reading.



(1) x + w = 9                (5) 2x + 2y = 6

(2) x^2 - C - D = 0          (6) y^2 - 3y + 2 = z

(3) C + 3D = 6               (7) 4x - y = 7

(4) 2C - D = 5



The list TERMS is empty, and thus the values found will be
numbers. In this case SOLVE would return with the list of pairs
"((y, 1)(x, 2)(z, 0))," which indicates that the values x = 2 and
z = 0 satisfy this set of equations (or at least the subset of this set
which was used to determine the values). The value y = 1 was
found during the solving process.

Most of the work of SOLVE is done by the function SOLVER.
SOLVE transmits to SOLVER the list of wanted variables, the list of
TERMS, and a null association list (called ALIS) which is
recursively built up to give the answer. The value of SOLVER is this
association list of pairs, with the first element of each pair
being a variable whose value has been found. The second element of
each is an arithmetic expression which may contain any variable
on the list TERMS (as was the case for the ALIS of SOLVE). However,
it may also contain variables which are first elements of pairs
later on the association list. If values for variables given by
later pairs are substituted into this arithmetic expression, one
gets an expression containing only numbers and
variables on the list TERMS. In the example, SOLVER would
return with the association list ((y, (4x-7)) (x,2) (z,0)) which
gives y in terms of x. SOLVE then makes the substitutions and
simplification on the association list returned by SOLVER.

SOLVER is a program which solves for a list of wanted
variables. It does this by choosing one of these variables, adding
the others to the list of terms and calling SOLVE1 to solve for this
one variable in terms of the other wanted variables and the original
TERMS. If SOLVE1 succeeds in solving for this variable, SOLVER
pairs this one variable with the expression found, puts this pair
on the end of the ALIS, and using this substitution in every
equation it tries to solve, attempts to solve for the remaining wanted
variables. If there are no more, SOLVER is finished, and returns the
association list built up.

SOLVE1 solves for a single wanted variable by finding an
equation containing this variable, after all substitutions of
values for variables listed on the ALIS have been made. It then
makes a list of all the other variables in the equation, and checks
to see if there are any not on the list TERMS. If so it calls
SOLVER to solve for these new variables in terms of the wanted
variable and the variables in TERMS. If SOLVER is unsuccessful,
SOLVE1 tries to find another equation containing the wanted variable,
and repeats the process. If there is none, SOLVE1 has the value
INSUFFICIENT. If SOLVER is successful and values for these new
variables are found, or if there were no new variables, SOLVE1
finally calls SOLVEQ, which tries to solve this equation for the
wanted variable. If the equation is linear in this variable,
SOLVEQ will be successful and give a solution. SOLVE1 will add
a pair consisting of the wanted variable and this value to the end
of ALIS, and return with this augmented ALIS as its value. If
SOLVEQ is unsuccessful, SOLVE1 tries to find another equation, but if
no solution can be found SOLVE1 returns the value UNSOLVABLE.
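The role played by SOLVEQ, solving one equation for one variable only when the equation is linear in that variable, can be sketched as follows. This is a Python illustration and not the LISP definition in the figure: the names `linear_form` and `solveq`, the tuple representation of prefix expressions, and the restriction to PLUS, unary MINUS, TIMES and EXPT are assumptions of the example, and the linearity test is deliberately conservative (it rejects any product or power involving the wanted variable without trying to simplify it first).

```python
from fractions import Fraction

class NonLinear(Exception):
    """The expression is not linear in the wanted variable."""

def linear_form(expr, var, env):
    """Return (a, b) such that expr == a*var + b, or raise NonLinear.

    expr -- a number, a variable name, or a prefix tuple
    env  -- values already found for the other variables
            (an unknown variable raises KeyError)
    """
    if isinstance(expr, (int, Fraction)):
        return Fraction(0), Fraction(expr)
    if isinstance(expr, str):
        if expr == var:
            return Fraction(1), Fraction(0)
        return Fraction(0), Fraction(env[expr])
    op, *args = expr
    forms = [linear_form(arg, var, env) for arg in args]
    if op == "PLUS":
        return sum(a for a, _ in forms), sum(b for _, b in forms)
    if op == "MINUS":                       # unary minus
        (a, b), = forms
        return -a, -b
    if op == "TIMES":
        (a1, b1), (a2, b2) = forms
        if a1 != 0 and a2 != 0:
            raise NonLinear(expr)           # var times var
        return a1 * b2 + a2 * b1, b1 * b2
    if op == "EXPT":
        (a, b), (pa, power) = forms
        if a != 0 or pa != 0:
            raise NonLinear(expr)           # var inside a power
        return Fraction(0), b ** power
    raise ValueError(f"unknown operator: {op}")

def solveq(lhs, rhs, var, env):
    """Solve lhs = rhs for var; return None if that is not possible."""
    try:
        a1, b1 = linear_form(lhs, var, env)
        a2, b2 = linear_form(rhs, var, env)
    except NonLinear:
        return None
    if a1 == a2:
        return None                         # var cancels out
    return (b2 - b1) / (a1 - a2)
```

As a hypothetical usage, take an equation of the form y^2 - 3y + 2 = z. Once y is known to be 1 the equation is linear in z and `solveq` returns 0; asked to solve the same equation for y, it returns `None`, which is the situation in which SOLVE1 must look for another equation.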

This description has been a rather lengthy attempt to
explain just one page of LISP program, the figure at the end of this chapter.
To make it more concrete, let us consider what happens when SOLVER
tries to solve the set of equations below (the same set shown
earlier).

(1) x + w = 9                (5) 2x + 2y = 6

(2) x^2 - C - D = 0          (6) y^2 - 3y + 2 = z

(3) C + 3D = 6               (7) 4x - y = 7

(4) 2C - D = 5

SOLVER is asked to solve for x and z. It asks SOLVE1 to
solve for x in terms of z. SOLVE1 picks equation (1), finds that
a new variable, w, has appeared, and asks SOLVER to solve for w
in terms of x and z. Since there is no other equation in
this set containing w, SOLVER fails, and SOLVE1 abandons equation (1),
and goes to equation (2). Here it calls SOLVER to solve for the
two new variables C and D in terms of x and z. In this case
SOLVER is successful, using equations (3) and (4). But when these
values are substituted in equation (2), SOLVEQ cannot solve for x
because the equation is not linear in x.



SOLVE1 then abandons equation (2) and the results it obtained
as subgoals for solving (2). It finds an occurrence of x again
in (5). Again it calls on SOLVER to solve for the new variable
y in terms of x and z. SOLVER tries to use (6), but SOLVEQ cannot
solve this equation for y. Using (7) SOLVER returns with an ALIS
of ((y, (4x - 7))). Using this ALIS, substituting this value for y
into (5), SOLVE1 calls on SOLVEQ to solve this equation for x,
which it does, and finally SOLVE1 returns to SOLVER the ALIS
((y, (4x - 7)), (x,2)) which does give the value of x in terms
of z. Having found x in terms of z, SOLVER will now call SOLVE1
to find the value of z. SOLVE1 finds an occurrence of z in
equation (6), and after substituting from the ALIS, SOLVEQ
is able to solve this equation for z, because it is linear in z.
Adding the pair (z,0) to the ALIS, SOLVE1 returns this to SOLVER,
which passes on this ALIS ((y, (4x - 7)), (x,2), (z,0)) to SOLVE.
SOLVE, using the function SUBORD, which substitutes in order
pairs on an ALIS into an expression and simplifies, finally returns
the ALIS ((y, 1) (x, 2) (z,0)).
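The final substitution step attributed to SUBORD can be illustrated with a short sketch. This is Python, not the LISP function itself; it assumes the list TERMS is empty (so every expression reduces to a number), it supports only PLUS, unary MINUS and TIMES, and the tuple representation of expressions is an assumption of the example.

```python
from fractions import Fraction

def evaluate(expr, env):
    """Evaluate a prefix expression using the values in env."""
    if isinstance(expr, (int, Fraction)):
        return Fraction(expr)
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    values = [evaluate(arg, env) for arg in args]
    if op == "PLUS":
        return sum(values, Fraction(0))
    if op == "MINUS":                       # unary minus
        return -values[0]
    if op == "TIMES":
        product = Fraction(1)
        for value in values:
            product *= value
        return product
    raise ValueError(f"unknown operator: {op}")

def subord(alis):
    """Substitute later pairs into earlier ones, working from the last pair."""
    env = {}
    for var, expr in reversed(alis):
        env[var] = evaluate(expr, env)
    return [(var, env[var]) for var, _ in alis]

# ((y, (4x - 7)), (x, 2), (z, 0)) as prefix tuples
alis = [("y", ("PLUS", ("TIMES", 4, "x"), ("MINUS", 7))), ("x", 2), ("z", 0)]
result = subord(alis)
```

Because each expression may mention only variables that appear later on the list, processing the pairs from the end guarantees that every value needed is already known; for the ALIS above, `result` pairs y with 1, x with 2 and z with 0.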

This example shows the rather tortuous recursions that these
functions use to solve a set of equations. Why should we use this
type of solving program instead of a more straightforward matrix
method? The principal reason is that, as shown, nonlinear equations
may appear in the set. In this case, if appropriate values can be
found from other equations which when substituted into this
nonlinear equation make it linear in the variable for which we want to
solve, then SOLVE will find the value of this variable.

The method of operation of SOLVER requires that if n
variables appear in any equation, and that equation is used, then at
least n-1 other independent equations containing these variables must
be in the set of equations, or the actual mechanics of solving will
not be started. This eliminates much work if there are extraneous
equations in the set which contain one or two of the wanted variables.
However, it precludes solving a set of equations which is
homogeneous in one unwanted variable, which would therefore cancel out
in the solution process. This is the principal reason why problems
such as:

"Spigot A fills a tub in 1 hour, and spigot B in 2
hours. How long do they take together?"

cannot be solved by STUDENT.
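The spigot problem shows what "homogeneous in one unwanted variable" means. If V is the (unknown and unwanted) volume of the tub, the rates are V/1 and V/2 per hour, and the time t together satisfies (V/1 + V/2) t = V; V appears in every term and cancels, leaving t = 1 / (1/1 + 1/2). The symbol V and the function name below are introduced only for this illustration.

```python
from fractions import Fraction

def time_together(hours_a, hours_b):
    """Hours needed when both spigots run, given each spigot's time alone.

    The tub volume V cancels from (V/hours_a + V/hours_b) * t = V,
    so it never needs a value.
    """
    combined_rate = Fraction(1, hours_a) + Fraction(1, hours_b)  # tubs per hour
    return 1 / combined_rate

print(time_together(1, 2))   # 2/3
```

A solver that insists on n-1 further equations before using an equation in n variables never starts on this system, since there is no separate equation giving V, even though the answer (two thirds of an hour) does not depend on V.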

This solving subroutine set is an independent package in the 
STUDENT program. Therefore, improvements can be made to it without 
disturbing the rest of the processing. The routine described 
here was designed to handle most of the problems that can be found in
first year algebra texts.

Figure 5: The SOLVE Program in STUDENT



CHAPTER VII: CONCLUSION



A. Results.

The purpose of the research reported here was to develop
techniques which facilitate natural language communication with
a computer. A semantic theory of coherent discourse was proposed
as a basis for the design and understanding of such man-machine
systems. This theory was only outlined, and much additional work
remains to be done. However, in its present rough form, the
theory served as a guide for construction of the STUDENT system,
which can communicate in a limited subset of English.

The language analysis in STUDENT is an implementation of the
analytic portion of this theory. The STUDENT system has a very
narrow semantic base. From the theory it is clear that by utilizing
this knowledge of the limited range of meaning of the input discourse,
the parsing problem becomes greatly simplified, since the number of
linguistic forms that must be recognized is very small. If a
parsing system were based on any small semantic base this same
simplification would occur. This suggests that in a general language
processor, some time might be spent putting the input into a semantic
context before going ahead with the syntactic analysis.

The semantic base of the STUDENT language analysis is delimited
by the characteristics of the problem solving system embedded in it.
STUDENT is a question-answering system which answers questions posed
in the context of "algebra story problems". In the introduction
we used four criteria for evaluating several other question-answering
systems. Let us compare the STUDENT system to these others in the light
of these criteria.

1) Extent of Understanding: All the other question-answering
systems discussed analyze input sentence by sentence.
Although a representation of the meaning of all input sentences
may be placed in some common store, no syntactic connection is
ever made between sentences.

In the STUDENT system, an acceptable input is a sequence of
sentences, such that these sentences cannot be understood by just
finding the meanings of the individual sentences, ignoring their
local context. Inter-sentence dependencies must be determined, and
inter-sentence syntactic relationships must be used in this case for
solution of the problem given. This extension of the syntactic
dimension of understanding is important because such inter-sentence
dependencies (e.g., the use of pronouns) are very common in
natural language communication.

The semantic model in the STUDENT system is based on one
relationship (equality) and five basic arithmetic functions.
Composition of these functions yields other functions which are also
expressed as individual linguistic forms in the input language.
The input language is richer in expressing functions than Lindsay's
or Raphael's system. [...]
relationships (predicates) [...]
any composition of these predicates. The logical combinations
of predicates used are only those expressible [...].

The deductive system in STUDENT, as in Lindsay's and Raphael's
programs, is designed for the type of questions to be asked. It
can only deduce answers of a certain type from the input information,
that is, arithmetic values satisfying a set of equations. In
performing its deductions it is reasonably sophisticated in avoiding
irrelevant information, as are the other two mentioned. It lacks
the general power of a logical system, but is much more efficient
in obtaining its particular class of deductions than would be a
general deductive system utilizing the axioms of arithmetic.



2) Facility for Extending Abilities. Extending the syntactic
abilities of any of the other question-answering systems discussed
would require reprogramming. In the STUDENT system new definitional
transformations can be introduced at run time without any
reprogramming. The information concerning these transformations can be
input in English, or in a combination of English and METEOR, if that is
more appropriate. New syntactic transformations must be added by
extending the program.

The semantic base of the STUDENT system can be extended only
by adding new programs, as is true of the other question-answering
systems discussed. However, STUDENT is organized to facilitate
such extensions, by minimizing the interactions of different parts
of the program. The necessary information need only be added to the
program equivalent of the table of operators in Figure 4, in
Chapter IV.

Similarly, the deductive portion of STUDENT, which solves the
derived set of equations, is an independent package. Therefore, a
new extended solver can be added to the system by just replacing
the package, and maintaining the input-output characteristics of
this subroutine.



3) Knowledge of Internal Structure Needed by User. Very
little if any internal knowledge of the workings of the STUDENT
system need be known by the user. He must have a firm grasp of the
type of problem that STUDENT can solve, and a knowledge of the input
grammar. For example, he must be aware that the same phrase must
always be used to represent the same variable in a problem, within
the limits of similarity defined earlier. He must realize that
even within these limits STUDENT will not recognize more than one
variation on a phrase. But if the user does forget any of these
facts, he can still use the system, for the interaction discussed
in the next section allows him to make amends for almost any mistake.

4) Interaction with the User. The STUDENT system is embedded
in a time-sharing environment (the [...] time-sharing
system (13)), and this greatly facilitates interaction with the
user. STUDENT differentiates between its failure to solve a
problem because of its mathematical limitations and failure from
lack of sufficient information. In case of failure it asks the user
for additional information, and suggests the nature of the needed
information (relationships among variables of the problem). It
can go back to the user repeatedly for information until it has
enough to solve the problem, or until the user gives up.

STUDENT also reports when it does not recognize the format of
an input sentence. Using this information as a guide, the user is
in a teaching-machine type situation, and can quickly learn to speak
STUDENT's brand of input English. By monitoring the assumptions
that STUDENT makes about the input, and the global information it
uses, the user can stop the system and reword a problem to avoid
an unwanted ambiguity, or add new general information to the
global information store.

The crucial point in this user interaction is that STUDENT is
embedded in an on-line time-sharing system, and can thus provide more
interaction than any of the other systems mentioned.



B. Extensions. 

The present STUDENT system has reached the maximum size
allowable in the LISP system on a thirty-two thousand word IBM 7094.
Therefore, very little can be added directly to the present system. All
the programming extensions mentioned here are predicated on the
existence of a much larger memory machine.

Without inventing any new techniques, I think that the STUDENT
system could be made to understand most of the algebra story
problems that appear in first year high school text books. If new
operators, or new combinations of arithmetic operations occur, they
can easily be added to OPFORM, the subroutine which maps the kernel
English sentences into equations. The number of formats
recognizable in the system can be increased without reprogramming
through the machinery available for storing global information
(this was discussed in more detail in Chapter V). The problems it
would not handle are those having excessive verbiage or implied
information about the world not expressible in a single sentence.

As mentioned earlier, the system can now make use of any given
schema only once in solving a problem. This is because the schema
equation is added to the set of equations to be solved, and the
variables in the schema are identified with only one other set of
variables appearing in the problem. For example, if "distance equals
speed times time" were the schema, then "distance", as a variable
in the schema, might be set equal to "distance traveled by train"
or "distance traveled by plane", but not both in the same problem.
This problem could be resolved by not adding the schema equation
directly to the set of equations to be solved, but by looking for
consistent sets of variables to identify with the schema variables.
Then STUDENT could add an instance of the schema equation, with the
appropriate substitutions, for each consistent set of variables
found which are "similar" to the schema variables.
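
The proposed fix for the schema limitation can be sketched in a few
lines of Python. This is only an illustration, not the STUDENT code:
the similarity test (a problem variable begins with the schema word)
and the grouping rule (variables ending in the same word form one
consistent set) are assumptions made here for the example, and the
names SCHEMA_VARS, consistent_sets and schema_instances are
hypothetical.

```python
from collections import defaultdict

# Schema "distance equals speed times time", given by its variable names.
SCHEMA_VARS = ("distance", "speed", "time")


def consistent_sets(schema_vars, problem_vars):
    """Group problem variables into sets that can each fill the schema.

    Assumption (not from the thesis): a problem variable is "similar"
    to a schema variable when it begins with that word, and variables
    belong to the same set when they end with the same word.
    """
    groups = defaultdict(dict)
    for name in problem_vars:
        words = name.split()
        if len(words) > 1 and words[0] in schema_vars:
            groups[words[-1]][words[0]] = name
    # Keep only the sets that bind every schema variable.
    return [groups[key] for key in sorted(groups)
            if len(groups[key]) == len(schema_vars)]


def schema_instances(problem_vars):
    """Return one instantiated schema equation per consistent set."""
    return [
        "({distance}) = ({speed}) * ({time})".format(**binding)
        for binding in consistent_sets(SCHEMA_VARS, problem_vars)
    ]


if __name__ == "__main__":
    variables = [
        "distance traveled by train", "speed of train", "time spent on train",
        "distance traveled by plane", "speed of plane", "time spent on plane",
    ]
    for equation in schema_instances(variables):
        print(equation)
```

With both a train and a plane in the problem, the sketch yields two
instances of the schema equation instead of forcing "distance" to be
identified with only one of them.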

At the moment the solving subroutine of STUDENT can only
perform linear operations on literal equations, and substitutions of
numbers in polynomials and exponentials. It would be relatively
easy to add the facility for solving quadratic or even higher order
solvable equations. One could even add, quite easily, sufficient
mechanisms to allow the solver to perform the differentiation needed
to do related rate problems in the differential calculus.

The semantic base of the STUDENT system could be expanded. In
order to add the relations recognized by the SIR system of Raphael,
for example, one would have to add on the lowest level of the STUDENT
program the set of kernel sentences understood in SIR, their mapping
to the SIR model, and the question-answering routine to retrieve
facts. Then the apparatus of the STUDENT system would process much
more complicated input statements for the SIR model. A serious
problem which arises when the semantic base is extended is based on
the fact that one kernel may have an interpretation in terms of two
different semantic bases. For example, "Tom has 3 fish." can
be interpreted in both SIR and the present STUDENT system. To
resolve this semantic ambiguity, the program can check the context
of the ambiguous statement to see if there has been one consistent
model into which all the other statements have been processed. If
the latter condition does not determine a single preferred
interpretation for the statement, then both interpretations can be stored.
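
The context check described above can be sketched as follows. This is
a hypothetical illustration in Python, not part of STUDENT or SIR: the
function name choose_interpretations and the representation of each
statement as a set of model names are assumptions made for the example.

```python
def choose_interpretations(candidate_models, context_models):
    """Pick the semantic base(s) under which to store an ambiguous kernel.

    candidate_models: set of model names able to interpret the kernel.
    context_models: list of sets, one per other statement, naming the
        models into which that statement was successfully processed.
    """
    consistent = set(candidate_models)
    for models in context_models:
        consistent &= set(models)
    # A single consistent model determines the preferred interpretation.
    if len(consistent) == 1:
        return sorted(consistent)
    # Otherwise store every interpretation, as the text suggests.
    return sorted(candidate_models)


if __name__ == "__main__":
    # "Tom has 3 fish." fits both models; the context is purely algebraic.
    print(choose_interpretations({"SIR", "STUDENT"},
                                 [{"STUDENT"}, {"STUDENT"}]))
```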

In addition to these immediate extensions of the STUDENT system,
our semantic theory of discourse can be used as a basis for a much
more general language processing system. As a start, one could
implement the generative grammar described in Appendix E to produce
coherent discourse (problems solvable by the STUDENT system).
Another extension would use the notion of a speaker's model of the
world to attack Yngve's "baseball announcer" problem. The baseball
announcer has certain propositions added to his world model from
the events he perceives, i.e. the baseball game he is watching.
Mandatory application of certain semantic rules adds other
propositions, and deletes some that are there. While these
changes are going on, the announcer is to generate a running
commentary (coherent discourse) describing the baseball game he is
watching. By making the proper assumptions about where the attention
of the announcer is focused, that is, which propositions he is
going to use as a base of his discourse at any time, I feel that a
reasonable facsimile of an announcer can be produced. This is,
of course, an empirically testable hypothesis.

Another use for this model for generation and analysis of
discourse is as a hypothesis about the linguistic behaviour of
people. Psychologists have built reasonable computer models for
human behaviour in decision making (17), verbal learning of nonsense
syllables (15), and some problem solving situations ([...]). STUDENT
may be a good predictive model for the behaviour of people when
confronted with an algebra problem to solve. This can be tested, and
such a study may lead to a better understanding of human behaviour,
and/or a better reformulation of this theory of language processing.

I think we are far from writing a program which can understand
all, or even a very large segment of English. However, within its
narrow field of competence, STUDENT has demonstrated that
"understanding" machines can be built. Indeed, I believe that using the
techniques developed in this research one could construct a system
of practical value which would communicate well with people in
English over the range of material understood by the program.
2) Definition of the Function OPFORM
REMEMBER ((
(PEOPLE IS THE PLURAL OF PERSON)
(FEET IS THE PLURAL OF FOOT)
(YARDS IS THE PLURAL OF YARD)
(FATHOMS IS THE PLURAL OF FATHOM)
(INCHES IS THE PLURAL OF INCH)
(SPANS IS THE PLURAL OF SPAN)
(ONE HALF ALWAYS MEANS 0.5)
(THREE NUMBERS ALWAYS MEANS THE FIRST NUMBER AND THE SECOND
NUMBER AND THE THIRD NUMBER)
(FIRST TWO NUMBERS ALWAYS MEANS
THE FIRST NUMBER AND THE SECOND NUMBER)
(MORE THAN ALWAYS MEANS PLUS)
(THESE ALWAYS MEANS THE)
(TWO NUMBERS SOMETIMES MEANS ONE NUMBER AND THE
OTHER NUMBER)
(TWO NUMBERS SOMETIMES MEANS ONE OF THE
NUMBERS AND THE OTHER NUMBER)
(HAS IS A VERB)
(GETS IS A VERB)
(HAVE IS A VERB)
(LESS THAN ALWAYS MEANS LESSTHAN)
(LESSTHAN IS AN OPERATOR OF LEVEL 2)
(PERCENT IS AN OPERATOR OF LEVEL 2)
(PERCENT LESS THAN ALWAYS MEANS PERLESS)
(PERLESS IS AN OPERATOR OF LEVEL 2)
(PLUS IS AN OPERATOR OF LEVEL 2)
(SUM IS AN OPERATOR)
(TIMES IS AN OPERATOR OF LEVEL 1)
(DIVBY IS AN OPERATOR OF LEVEL 1)
(OF IS AN OPERATOR)
(DIFFERENCE IS AN OPERATOR)
(SQUARED IS AN OPERATOR)
(MINUS IS AN OPERATOR OF LEVEL 2)
(PER IS AN OPERATOR)
(SQUARED IS AN OPERATOR)
(YEARS OLDER THAN ALWAYS MEANS PLUS)
(YEARS YOUNGER THAN ALWAYS MEANS LESS THAN)
(IS EQUAL TO ALWAYS MEANS IS)
(PLUSS IS AN OPERATOR)
(MINUSS IS AN OPERATOR)
(HOW OLD ALWAYS MEANS WHAT)
(THE PERIMETER OF $1 RECTANGLE SOMETIMES MEANS
TWICE THE SUM OF THE LENGTH AND WIDTH OF THE RECTANGLE)
(GALLONS IS THE PLURAL OF GALLON)
(HOURS IS THE PLURAL OF HOUR)
(MARY IS A PERSON)
(ANN IS A PERSON)
(BILL IS A PERSON)
(A FATHER IS A PERSON)
(AN UNCLE IS A PERSON)
(POUNDS IS THE PLURAL OF POUND)
(WEIGHS IS A VERB)
))

REMEMBER ((
(DISTANCE EQUALS SPEED TIMES TIME)
(DISTANCE EQUALS GAS CONSUMPTION TIMES
NUMBER OF GALLONS OF GAS USED)
(1 FOOT EQUALS 12 INCHES)
(1 YARD EQUALS 3 FEET)
))
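
The sentences passed to REMEMBER fall into a small number of fixed
forms (plurals, ALWAYS MEANS and SOMETIMES MEANS substitutions,
operators with an optional level, verbs, persons, and equations). A
minimal sketch of how such sentences could be sorted into those forms
is given below in Python. It is not the STUDENT implementation, which
was written in LISP and METEOR; the pattern table and the function
name classify are assumptions introduced for illustration.

```python
import re

# Assumed sentence forms, tried in order; each yields the captured parts.
PATTERNS = [
    ("plural", re.compile(r"^(.+) IS THE PLURAL OF (.+)$")),
    ("always", re.compile(r"^(.+) ALWAYS MEANS (.+)$")),
    ("sometimes", re.compile(r"^(.+) SOMETIMES MEANS (.+)$")),
    ("operator", re.compile(r"^(.+) IS AN OPERATOR(?: OF LEVEL (\d))?$")),
    ("verb", re.compile(r"^(.+) IS A VERB$")),
    ("person", re.compile(r"^(?:AN? )?(.+) IS A PERSON$")),
    ("equation", re.compile(r"^(.+) EQUALS (.+)$")),
]


def classify(sentence):
    """Return (kind, part, ...) for one global-information sentence."""
    for kind, pattern in PATTERNS:
        match = pattern.match(sentence)
        if match:
            return (kind,) + match.groups()
    return ("unknown", sentence)


if __name__ == "__main__":
    for fact in ("FEET IS THE PLURAL OF FOOT",
                 "ONE HALF ALWAYS MEANS 0.5",
                 "PLUS IS AN OPERATOR OF LEVEL 2",
                 "1 FOOT EQUALS 12 INCHES"):
        print(classify(fact))
```

An operator stated without a level, such as (OF IS AN OPERATOR), comes
back with None in the level position, so a caller can tell the two
operator forms apart.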
APPENDIX E: A SMALL SEMANTIC GENERATIVE GRAMMAR

The grammar outlined here will generate only word problems
solvable by STUDENT, though not the set of all such problems.



RULES, EACH FOLLOWED BY ITS EXAMPLES

Create a set of simultaneous equations which can be solved by
strictly linear techniques, except that substitution of numerical
values in higher order equations which reduce them to linear
equations is allowed. These are the propositions of the speaker's
model.

    Example:  2x + 3y = 7
              y = 1/2 x
              y + z = x^2

Choose unknowns for which STUDENT is to solve. This is the
question.

Choose unique names for variables without articles "a", "an",
or "the". In the problem any of these articles may be used at any
occurrence of a name. In a complete model these names would be
associated with the objects in the chosen propositions.

    Example:  x = first number
              y = second number Tom chose
              z = third number

Write one kernel sentence for each equation. Use any
appropriate linguistic form given in the table below to represent the
arithmetic functions in the equation.

    Example:  2 times the first number plus three times the second
              number Tom chose is 7. The second number Tom chose
              equals .5 of the first number. The sum of the second
              number Tom chose and the third number is equal to the
              square of the first number.

For each unknown whose value is to be found, use a kernel
sentence of the form "Find ___." or "What is ___?", or "Find ___
and ___." or "What are ___ and ___?" for more than one such unknown.

    Example:  What is the third number?

If a name appears more than once in a problem, some (or all)
occurrences after the first may be replaced by a "similar" name.
Similar names are obtained by transformations which:

    a) insert a pronoun for a noun phrase in the name.
    b) delete initial and/or terminal substrings of the name.

Only one such "similar" string can be used to replace an
occurrence of a name, though any number of replacements can be made.

    Example:  Similar names are "first" for "first number" and
              "second number he chose" for "second number Tom
              chose".

If P occurs in S_j and S_{j+1}, and in S_j it is the entire
substring to the left of "is", "equals" or "is equal to" (or the
entire substring to the right), then in S_{j+1}, P may be replaced
by any phrase containing the word "this".

    Example:  Replace "the second number Tom chose" by "this second
              choice" in the third sentence.

Any phrase P_1 may be replaced by another phrase P_2 which means
the same thing. This would mean that STUDENT had been told of this
equivalence, using REMEMBER and the sentence "P_2 always means P_1"
or "P_2 sometimes means P_1".

    Example:  Replace "2 times" by "twice" and ".5" by "one half".

Two consecutive sentences may be connected by replacing the
period after the first by ", and". A sentence can be connected to a
question by preceding the sentence by "If" and replacing the period
at the end of the sentence by ",".

    Example:  Connect sentences 1 and 2, and sentence 3 and the
              final question to give:

              "Twice the first number plus three times the second
              number Tom chose is 7, and the second number he chose
              is one half of the first. If the sum of this second
              choice and the third number is equal to the square of
              the first number, what is the third number?"
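
As a check on the example problem, its three propositions
(2x + 3y = 7, y = x/2, and y + z = x^2, where x, y and z are the
first, second and third numbers) can be solved in the way the first
rule allows: the two linear equations give x and y, and substituting
the numerical value of x reduces the higher order equation to a linear
one in z. The Python below is a hand-written sketch of that
arithmetic, not the STUDENT solver; the function name solve_example is
hypothetical.

```python
from fractions import Fraction


def solve_example():
    """Solve 2x + 3y = 7, y = x/2, y + z = x**2 by substitution."""
    # Substituting y = x/2 into 2x + 3y = 7 gives (2 + 3/2) x = 7.
    x = Fraction(7) / (Fraction(2) + Fraction(3, 2))
    y = x / 2
    # With x numeric, x**2 is a constant, so y + z = x**2 is linear in z.
    z = x ** 2 - y
    return x, y, z


if __name__ == "__main__":
    x, y, z = solve_example()
    print("first number =", x)
    print("second number =", y)
    print("third number =", z)
```

Under these equations the third number, which is the unknown asked
for, comes out as 3.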
Summary of Linguistic Forms to Express Arithmetic Functions
and the Equality Relation

    x = y     x is y; x equals y; x is equal to y
    x + y     x plus y; the sum of x and y; x more than y
    x - y     x minus y; the difference between x and y;
              y less than x
    x * y     x times y; x multiplied by y; x of y (if x
              is a number)
    x / y     x divided by y; x per y
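
The table of linguistic forms can be read as a mapping from each
arithmetic function to an English template. The sketch below, in
Python, renders a nested expression into a kernel sentence using one
form per operator. The choice of form for each operator, the tuple
representation of expressions, and the names FORMS and render are all
assumptions made for this illustration; a fuller generator would
choose among the alternative forms listed in the table.

```python
# One English template per operator, taken from the summary table.
FORMS = {
    "=": "{0} is {1}",
    "+": "the sum of {0} and {1}",
    "-": "the difference between {0} and {1}",
    "*": "{0} times {1}",
    "/": "{0} divided by {1}",
}


def render(expr):
    """Render a name or an (operator, left, right) tuple as English."""
    if isinstance(expr, str):
        return expr
    operator, left, right = expr
    return FORMS[operator].format(render(left), render(right))


if __name__ == "__main__":
    equation = ("=",
                ("+", "the second number", "the third number"),
                ("*", "2", "the first number"))
    print(render(equation) + ".")
```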